COVID-19

NCHS data visualized

Data last updated on August 21st, 2020

COVID-19's reporting has become the subject of much controversy.

While some argue that coronavirus deaths are over-reported, others argue that they are actually under-reported. Inaccuracies in the data came to light after multiple states had to remove hundreds of people from their records, and the CDC's decision not to require testing before reporting deaths as COVID-19 has drawn significant scrutiny.

So why study the data at all if there are so many concerns about it? Because rather than use the latest up-to-the-minute numbers that most sites rely on, we will be looking at the National Center for Health Statistics (NCHS) provisional death count data. Unlike other sources, the provisional death counts may lag behind and report lower numbers initially. However, they deliver the most complete and accurate picture available of lives lost to COVID-19, as they are based on the final data from processed death certificates.


Is COVID-19 real?

Before diving into the data to pull out numbers it is important to start by looking at the total death counts from all sources for two reasons:

We can address both of these concerns by ignoring the reported numbers initially and only looking at the overall death counts compared to the average from the previous three years. This will provide us with a visual indication as to whether or not COVID-19 has caused any real change in the number of people dying.
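The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code behind the chart: the function name and the weekly counts are made up, and the only thing taken from the text is the method (compare a week's deaths to the average of the same week over the previous three years).

```python
from statistics import mean

def percent_of_expected(current_week_deaths, prior_years_same_week):
    """Express this week's deaths as a percentage of the prior-year average."""
    expected = mean(prior_years_same_week)  # "expected" = average of earlier years
    return 100.0 * current_week_deaths / expected

# Hypothetical counts for one calendar week (NOT real NCHS figures).
prior_three_years = [55_000, 56_000, 57_000]
this_year = 70_000

print(round(percent_of_expected(this_year, prior_three_years), 1))  # 125.0
```

Anything above 100 means more people died that week than the previous three years would lead us to expect.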

US deaths by week

Total deaths in the United States are up by 14.2%

As of August 21st, there have been 159,865 COVID-19 deaths reported and catalogued.

The immediate (and most important) takeaway is that deaths in the US increased dramatically beginning in late March and continued to climb over a three-week period until reaching a peak of around 142% of what we expected.

While this appears to initially confirm that these deaths are not simply other illnesses being incorrectly catalogued, it doesn't tell us that this spike in deaths is exclusively due to the coronavirus either. What we do know is that as of August 21st, 159,865 death certificates list COVID-19 on them.


Comorbidities

One of the big advantages of using death certificate data is that we can also get information about other medical issues that may have contributed to the death (comorbidities). It turns out that only 6% of people were reported to have died from COVID-19 alone, and the vast majority had 2 to 3 additional medical complications at the time of death.

COVID-19 was the sole cause of death in only 6% of cases

Obviously this complicates things quite a bit in terms of understanding the exact cause of death. But we can look at the various other medical conditions that may have played a role.

COVID-19 Comorbidities

The average coronavirus death certificate lists 2 to 3 of the following additional conditions
(items are sorted by the percentage of certificates that list them)

Age Breakdowns

The sheer amount of red in this graph is immediately noticeable (indicating that the vast majority of people who died from coronavirus were 65 or older).

However, this isn't entirely unexpected, as we know that almost 75% of seasonal flu deaths last year were among people 65 and older. And since most of the various strains of coronavirus have effects and symptoms very similar to those of influenza, we might be able to get a better understanding of how lethal COVID-19 is by looking at the rate of deaths by age group to see if there are any major differences (such as high death rates among younger age groups or any other major deviations from the "norm" caused by weakening immune systems and increased medical complications occurring naturally with age).

COVID-19 Deaths by Age Group

Unfortunately the data on death percentages by age for the flu and for coronavirus don't break down into the same age categories so the comparison isn't perfect, but what we can see is that the virus (like the flu) is increasingly likely to kill a person with age.

While it's true that age comes with weakened immune systems and increased risk of pre-existing medical conditions, another set of factors that may be overlooked is treatment and location. You may have seen report after report of hospitals offloading elderly patients to nursing homes (despite the increased risks of infection among the elderly). Since death certificates report the location of death, we can compare COVID-19 deaths to pneumonia deaths by location (pneumonia being the most common comorbidity).

COVID-19 & Pneumonia deaths by location

Data from Feb 1st - Aug 15th

Perhaps the most important thing to keep in mind when looking at this data is that pneumonia had a big head start over COVID-19 and killed tens of thousands of people before coronavirus really started to claim a lot of lives. So it would be incorrect to look at this chart and conclude that pneumonia is far deadlier than COVID-19.

Of the 309,578 people who died in nursing homes between Feb 1st and Aug 15th, 11.3% had COVID-19 listed on their death certificates (accounting for 22.2% of all coronavirus deaths in the US).

Keeping that in mind, it's unsurprising that pneumonia would have higher death totals in nearly every setting. But what is alarming is that COVID-19 has managed to surpass pneumonia in one key location... Nursing Homes.

Additionally, this number is likely low as nursing home residents who had to be taken to the hospital after contracting COVID-19 would be recorded as a hospital death if they didn't survive. While this makes sense, it does mean that they wouldn't be included in the total deaths of nursing home residents for a particular disease (so the percentage is likely higher than 11.3%).
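As a quick sanity check, the nursing home percentages quoted above can be turned back into head counts. The three inputs below come straight from the text; the variable names are mine, and because the percentages are rounded the results are approximate.

```python
# Figures quoted in the text (Feb 1st - Aug 15th).
nursing_home_deaths = 309_578           # all deaths in nursing homes
covid_share_of_nursing_home = 0.113     # 11.3% listed COVID-19
nursing_share_of_covid = 0.222          # 22.2% of all COVID-19 deaths

# COVID-19 deaths that were recorded in nursing homes.
covid_deaths_in_nursing_homes = nursing_home_deaths * covid_share_of_nursing_home

# Total COVID-19 deaths implied by the 22.2% figure.
implied_total_covid_deaths = covid_deaths_in_nursing_homes / nursing_share_of_covid

print(f"{covid_deaths_in_nursing_homes:,.0f}")
print(f"{implied_total_covid_deaths:,.0f}")
```

That works out to roughly 35,000 nursing home deaths and an implied total a little under 158,000, which lines up reasonably well with the 159,865 certificates counted a few days later on August 21st.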


Pneumonia

Considering the number of references made to the death rates and overlap between coronavirus and pneumonia, it would be good to take a look at that data too.

Comparative Deaths for the Flu, Pneumonia and COVID-19

The darker areas are the portions of deaths caused by just that disease alone.
The lighter areas represent deaths caused by a combination of illnesses

43.7% of all COVID-19 deaths also had pneumonia

What's significant here is the massive spike in pneumonia around the same time as coronavirus deaths (this isn't entirely unexpected however as many viruses can also trigger pneumonia).

Although we do see a slight increase early on in pneumonia cases, this could be due to misdiagnosis (since COVID-19 testing was very difficult to get access to early on and both can cause similar effects on a chest x-ray). However, as time went on and testing became more widespread and reliable, the pneumonia cases without COVID-19 settled back down (even dropping slightly) and the biggest spike overlaps with the coronavirus cases.

What is also interesting is that if we re-graph the data to show the amount of change from one week to the next (so any growth is positive on the graph and any recession goes negative) we can see the relationship between COVID-19 and pneumonia a little more clearly.
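Re-graphing the data this way just means subtracting each week's count from the next one. A minimal sketch, assuming the weekly totals are already in a plain list (the function name and sample numbers are made up for illustration):

```python
def weekly_change(counts):
    """Return the change from each week to the next (growth > 0, decline < 0)."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]

# Hypothetical weekly death counts, not real data.
weekly_deaths = [100, 150, 140, 140]

print(weekly_change(weekly_deaths))  # [50, -10, 0]
```

Note that the result has one fewer entry than the input, since the first week has nothing to be compared against.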

In many cases, COVID-19 is treated by putting a patient on a ventilator, which may be of particular significance here as 86% of all cases of pneumonia that occur while in the hospital are associated with the use of ventilators. It's common enough to even have its own name, Ventilator Associated Pneumonia (VAP). Other studies have also noted that a decrease in consciousness while intubated can raise the risk of developing pneumonia significantly (and heavy sedation appears to be common practice at many hospitals for treating COVID-19 as well).


How deadly is Coronavirus?

While the first graph showed the overall deaths from all sources in the US, the next question is how this relates to the number of weekly deaths in which coronavirus was listed on death certificates. To better compare these two graphs, we can graph the weekly changes in each total (same as the previous graph with COVID-19 and pneumonia).

At its height, COVID-19 accounted for 22.3% of all deaths in the US and an average of 9.2% overall between Feb 3rd and Jul 27th

Looking at the rates of growth for total deaths and coronavirus, it's clear that COVID-19 has been largely responsible for the climb in total deaths. However, one thing that's of interest is that coronavirus-related deaths are declining more slowly than overall deaths.

This could in part be due to people who would have died in the next few months dying early. So far 22.2% of all COVID-19 deaths have happened in nursing homes. Statistically, over half of the people who move into a nursing home die within 6 months (even without coronavirus), and the average life expectancy is just over a year. So many deaths in such a short time period within this demographic could result in fewer deaths being reported later in the year.

Additionally, various preventative measures such as social distancing and lockdown orders have undoubtedly resulted in fewer deaths in other areas. For instance, fewer people driving during the lockdowns should mean fewer car accidents etc.

Looking at data provided by Johns Hopkins, we can see that as of Jul 27th, 4,290,337 people had tested positive in the US for coronavirus. Unfortunately there is no way to know the actual number of infected people in the US, so this number is unrealistically low. And to complicate matters further, some people who have COVID-19 will die or have already died but have not yet been recorded. That said, given the numbers we have on hand and taking them at face value with no adjustment, an initial guesstimate as to the average mortality rate would be around 3.7%!
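For anyone who wants to check the math, the guesstimate is nothing more than recorded deaths divided by confirmed cases. Both inputs are the figures quoted in this article (note that they are from slightly different dates); the variable names are mine.

```python
covid_deaths = 159_865        # death certificates listing COVID-19, as of August 21st
confirmed_cases = 4_290_337   # positive tests in the US, as of July 27th

# Naive rate: takes both numbers at face value with no adjustment.
naive_mortality_rate = 100 * covid_deaths / confirmed_cases

print(f"{naive_mortality_rate:.1f}%")  # 3.7%
```

Because the denominator only counts people who tested positive, every uncounted infection pushes the true rate lower than this figure.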

Researchers at Stanford have estimated COVID-19's mortality rate to be about 0.25% (bringing it much closer to the mortality rate of the flu than initial predictions)

We know that mortality rates increase substantially with age, so if we attempt to adjust the numbers to see what the mortality rate is for those around 60 and under, the rate drops all the way to ~0.9%. However, as stated initially, these numbers are VERY unreliable as we don't know the full scope of infected or dead (so providing anything remotely accurate based purely on the CDC and Johns Hopkins data sets would be impossible).

Fortunately the researchers at Stanford University were up for the challenge and analysed data from 23 studies around the country using seroprevalence data to make more accurate estimates of total population infection rates and came to the conclusion that the average mortality rate was ~0.25%. When they evaluated for those under the age of 70 they estimated a mortality rate as low as ~0.04%! Keep in mind that due to how recent this is, the findings have not yet been peer reviewed but there are a multitude of other studies with similar results coming out too.

Why the massive difference in percentages? According to newer research, up to 45% of all people who become infected with coronavirus may never even get sick enough to develop symptoms! While this is certainly great news for those worried about contracting the virus, it's a nightmare for those trying to stop the spread (or those trying to get analytics data like mortality rates).


Effects on hospital resources

With the media constantly focused on the rising number of infections due to increases in testing, the question should be: how is this affecting hospital resources? After all, the entire purpose of sheltering at home, wearing masks and everything else was to "flatten the curve". So how have we done? Thanks to data provided by the Institute for Health Metrics and Evaluation in partnership with the University of Washington, we actually have an estimate of the current infection counts, based on available data and modeling, that we can compare to current hospital bed usage data.

Estimated Infections vs Hospitalizations

Keep in mind that infection counts must be extrapolated from the data available with modeling and they are only as accurate as the underlying data and assumptions available.

Although this chart shows the total hospital beds in use, it doesn't answer the question of how strained the system has been as a result of COVID-19. Unfortunately, showing this data on a national level is almost 100% useless, as some states have barely been affected at all while states like New York have been hit relentlessly.

It's also not really fair to break the data down by state, as there are hot spots within a state that can be straining resources while other areas are nearly unaffected by comparison. However, this is unfortunately the best that I can provide for now.

Hospital Resource Usage for

Only 3 states to date would have run the risk of exceeding their hospital bed capacity (based on historical annual averages) without restricting their services.

Although states have restricted hospital access in an attempt to free up beds to deal with spikes in patients due to COVID-19, to date 48 states (the District of Columbia is being counted as a "51st state") have not had enough COVID-19 patients, even during their highest spikes, to fill all of their state's hospital beds (assuming no change in annual average needs).

Obviously this statistic needs to be tempered with the understanding that hospitals are going to fill up around large dense cities faster than in rural areas, so just because only 3 states exceeded their projected capacity doesn't mean that other states don't have hospitals that are near or at capacity. However, even if we look at New York City (arguably the epicenter of US coronavirus outbreaks) we'll see that they didn't reach their capacity in reality either.

In fact, one temporary hospital that was built in NYC was closed without ever having treated a single patient, and others closed after only treating handfuls of patients. Even the Comfort (a Navy hospital ship that docked at NYC to help out) had so few patients that it converted its facilities to assist with COVID-19 patients, and it ultimately left after treating just 179 people and being told it wasn't needed.
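The check being described boils down to a single comparison: would a state's usual bed demand plus its worst COVID-19 spike fit into the beds it has? Here is a rough sketch of that test. The function, the field names, and the two example states are all hypothetical, and the real analysis may handle details (such as ICU beds) differently.

```python
def exceeds_capacity(total_beds, average_beds_in_use, peak_covid_patients):
    """True if normal demand plus the COVID-19 peak would not fit in the available beds."""
    return average_beds_in_use + peak_covid_patients > total_beds

# Made-up states with made-up numbers, for illustration only.
states = {
    "State A": {"total_beds": 10_000, "average_beds_in_use": 6_500, "peak_covid_patients": 2_000},
    "State B": {"total_beds": 10_000, "average_beds_in_use": 6_500, "peak_covid_patients": 4_000},
}

over_capacity = [name for name, s in states.items() if exceeds_capacity(**s)]
print(over_capacity)  # ['State B']
```

The "assuming no change in annual average needs" caveat lives in the `average_beds_in_use` input: if a state restricted services, its real non-COVID demand would have been lower than the historical average used here.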

Halting many hospital services to make room for COVID-19 patients has also had the side effect of causing very expensive procedures (that would normally bring in money to underwrite much of the daily expenses) to not be offered. As a result, across the country hospitals are being strained financially with at least 29 hospitals declaring bankruptcy as of June 3rd.


How COVID-19 affects various groups

COVID-19 spreads differently in different locations based on a variety of factors such as average age, population density, how the outbreak is handled by the state, etc. The goal of this section is to try and evaluate how COVID-19 affects various states by various sets of criteria.

Trends in deaths by political affiliation

States were split into 5 groups based on how they voted over the last 4 presidential elections

The states that voted for Democrats in 75% or more of the last 4 presidential elections are experiencing by far the highest increases in deaths on average

Please keep in mind that graphs like this only show the situation graphically and DO NOT explain why things appear the way they do. It would be easy to look at a chart like this, see that the only 2 lines trending upwards and the only areas with large mortality spikes are predominantly Democratic states, and immediately blame Democrats for making the problem worse. A more reasonable assumption may be that, since almost every single one of the top 25% most populated states in the US voted for Democrats in 3 or more elections, COVID-19 would naturally be spreading in those areas due to their population densities.

The primary reason for posting this graph is because Republicans are often blaming the Democrats for overreacting and the Democrats are often blaming the Republicans for not taking the situation seriously enough. Perhaps seeing the effects within these divides may shed some light as to why various groups are reacting the way they are.

Deaths by population density

States were split into 4 groups based on people per square mile

While the trend lines for political affiliation are a bit all over the place, sorting states by population density and splitting them into 4 equal groups results in predictably, steadily increasing death counts. It certainly is possible that policies enacted by political parties may be more likely to help or exacerbate the situation, but the takeaway would seem to be that for a virulent infection, more people in a close environment results in more spread.
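Sorting and splitting the states works roughly like this. It is a sketch of the grouping step only: the function name is mine, and the state names and densities below are placeholders rather than real figures.

```python
def split_into_groups(density_by_state, n_groups=4):
    """Sort states by density (low to high) and cut them into near-equal groups."""
    ranked = sorted(density_by_state, key=density_by_state.get)
    size, extra = divmod(len(ranked), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        # The first `extra` groups take one additional state when the split is uneven.
        end = start + size + (1 if i < extra else 0)
        groups.append(ranked[start:end])
        start = end
    return groups

# Placeholder states and people-per-square-mile values (not real data).
densities = {"A": 10, "B": 900, "C": 55, "D": 300, "E": 5, "F": 120, "G": 40, "H": 1_200}

print(split_into_groups(densities))
# [['E', 'A'], ['G', 'C'], ['F', 'D'], ['B', 'H']]
```

With 51 "states" (counting the District of Columbia) the groups can't be perfectly equal, so in this sketch the extra state goes into the first group.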

Deaths by lockdowns

States that did not lock down vs states that did

Once again, the standard disclaimer applies here as well. We can't look at this and say conclusively that lockdowns help or hurt. But what we can say is that if a person lives in New York City and looks at one of these open states, they should accept the fact that not all areas are being hit equally hard and that not locking down wasn't a bad decision. Likewise, it's clear that other states are dealing with higher infection rates, and it's equally unfair for states that aren't locked down to criticize states that are based purely on their own infection and mortality rates.