As an economist without formal training in epidemiology, I have done my best to leave the modelling to the experts. But the world has shut down around me, my life is suddenly much more complicated, and I have to wonder: is this COVID thing as dangerous as it seems? When things got bad in Italy my optimistic friends said, “that’s just Italy”. When things got bad in Spain, they said the same. But now New York has more deaths per capita than either Italy or Spain and I am starting to sweat a little. Is there something particularly bad about the New York health care system that has made the state more vulnerable to this disease than others?
Not judging by the mortality rates of previous flu seasons. For the last
four years New York has been among the 10 best-performing states according to
the CDC. In 2017 New York had the 7th lowest death rate of any state, beaten only by states with smaller elderly populations (Utah, Alaska, California, Colorado, Texas, and Washington).
Does New York have a particularly large elderly population which
has made it more vulnerable? Nope. New York state, with 14.66% of residents older
than 65, ranks squarely in the middle of the states (29th youngest of 50),
while New York City is younger than the rest of the state, with only 13% of the
city population older than 65.
Well maybe the mortality rate of COVID-19 just seems high
because it is about to peak? After all, the flu kills somewhere between 12 and 70
thousand people in the US every year and 290 to 650 thousand globally.
By comparison COVID-19, with an estimated 19 thousand deaths in the US and 104 thousand globally so far, doesn’t seem that dangerous.
Yet, the very reasonable concern is that this disease is
just getting started. Wikipedia numbers suggest the total number of cases
globally is 1.7 million, which we know is a lower bound of the true number of
cases, as many of those who have COVID-19 have not been tested.
We don’t know how many people currently have COVID but we
can imagine a few different scenarios.
Scenario 1: COVID-19 reported cases are close to true cases
Let’s imagine that the number of people with COVID is approximately
the number that we have on record. There are some unreported cases but not that
many. If this is the case, we are in an extremely frightening world because so
far the disease has killed about 104 thousand of the 1.7 million people it
has infected, a 6% mortality rate. Worse, almost all of those infected have not yet
recovered, meaning some of them will still die, increasing the observed mortality
rate. The small consolation under this scenario is that cases are largely
detected, so with enough government and individual intervention ongoing
transmission could likely be slowed and stopped through thorough and diligent contact
tracing.
Scenario 2: COVID-19 reported cases are a reasonable fraction of true cases
Let’s imagine that the true number of cases is somewhere between 2 and
10 times the number reported. Under this scenario, the current mortality
rate is the observed mortality rate divided by the undercount factor:
6%/2 = 3% if true cases are 2 times those reported and 6%/10 = 0.6% if they are 10 times. In this
scenario contact tracing will by and large fail, as there are simply too many unknown
cases. The best thing governments and individuals can do in this scenario is
shut off potential avenues of transmission between individuals until either a
vaccine can be found or the number of new cases is so small that the implementation
of contact tracing is feasible. Sadly, even in the scenario in which the true number
of cases is 10x the reported number, the mortality rate of COVID, at a
minimum of 0.6%, is still much higher than that of the seasonal flu. If left
unchecked it would result in roughly 2 million fatalities in the US alone (0.6% * 330
million), about as many as the top ten leading causes of death in the US
combined:
Table 1: Top ten leading causes of death in the US
Heart disease: 647,457
Cancer: 599,108
Accidents (unintentional injuries): 169,936
Chronic lower respiratory diseases: 160,201
Stroke (cerebrovascular diseases): 146,383
Alzheimer’s disease: 121,404
Diabetes: 83,564
Influenza and pneumonia: 55,672
Nephritis, nephrotic syndrome, and nephrosis: 50,633
Intentional self-harm (suicide): 47,173
Total: 2.08 million
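The arithmetic behind Scenarios 1 and 2 can be checked with a short Python sketch. The case and death counts are the rounded global figures quoted in this post, and the cause-of-death numbers are those in Table 1. The function names and the undercount_factor parameter are illustrative choices, not part of any published model, and the projection assumes the worst case in which every person in the population is eventually infected.

```python
# Rounded global figures quoted in the post.
REPORTED_CASES = 1_700_000
REPORTED_DEATHS = 104_000

# Table 1: top ten leading causes of death in the US, as listed above.
TOP_TEN_CAUSES = {
    "Heart disease": 647_457,
    "Cancer": 599_108,
    "Accidents (unintentional injuries)": 169_936,
    "Chronic lower respiratory diseases": 160_201,
    "Stroke (cerebrovascular diseases)": 146_383,
    "Alzheimer's disease": 121_404,
    "Diabetes": 83_564,
    "Influenza and pneumonia": 55_672,
    "Nephritis, nephrotic syndrome, and nephrosis": 50_633,
    "Intentional self-harm (suicide)": 47_173,
}


def implied_mortality_rate(deaths, reported_cases, undercount_factor=1.0):
    """Deaths divided by the assumed true number of cases.

    undercount_factor is the hypothetical ratio of true to reported cases:
    1 for Scenario 1, somewhere between 2 and 10 for Scenario 2.
    """
    if undercount_factor < 1:
        raise ValueError("undercount_factor must be at least 1")
    return deaths / (reported_cases * undercount_factor)


def projected_deaths(rate, population):
    """Deaths if everyone in `population` were eventually infected."""
    return rate * population


for factor in (1, 2, 10):
    rate = implied_mortality_rate(REPORTED_DEATHS, REPORTED_CASES, factor)
    per_million = projected_deaths(rate, 1_000_000)
    print(f"{factor:>2}x true cases: {rate:.1%} mortality, "
          f"{per_million:,.0f} deaths per million people infected")

print(f"Top ten causes combined: {sum(TOP_TEN_CAUSES.values()):,}")
```

Passing a national population to projected_deaths gives the unchecked-spread total for whatever population estimate one prefers. The unrounded rates come out slightly above the 6%, 3%, and 0.6% used in the text because 104/1,700 is closer to 6.1%.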
Scenario 3: COVID is already everywhere and most people have it or have already had it
Strangely this is the best-case scenario. Under this scenario
only those who have severe outcomes from COVID-19 are being reported while the vast
majority (like 99%) of individuals are asymptomatic. Under this scenario,
shutting down state, national, and international travel and social activities is
futile for any extended period of time as the virus is already everywhere and
we just need to treat the severe cases that pop up the best we can and suck it
up. This scenario is appealing as it means the worst has already come or is
soon to.
So which scenario are we in?
Reviewing the scenarios it is impossible to know with certainty
in which scenario lies reality. However, does the evidence point against any
given scenario?
Scenario 1 seems unlikely to me because tens of thousands of new
cases are popping up each day (Figure 5). This rate of new infections seems to indicate
that there is a sizable infected population which has not yet been detected and
has continued to spread the virus despite national, state, and local recommendations
and mandates intended to limit spread.
Scenario 3, in which COVID-19 is already everywhere,
seems unlikely due to the lumpiness of the mortality numbers. If COVID-19 were everywhere,
we would expect people across all states and countries to be dying from the
disease more or less proportionately, and mortality numbers to be mostly
homogeneous across states. This is not what we are seeing: mortality numbers are
highly heterogeneous across states and countries. New York currently has around 400
deaths per million, while New Jersey has 218, Michigan 108, Florida around 19,
California 14, Texas 8, and Montana 6.
These numbers suggest that COVID is spreading from infected communities
to non-infected communities in a hotspot community spread model rather than
that of a widespread dispersal characteristic of Scenario 3.
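One rough way to put a number on this lumpiness is sketched below in Python, using only the deaths-per-million figures quoted above. The two summary measures (the ratio of highest to lowest rate and the coefficient of variation) are my choice of illustration, not a formal epidemiological test. If the virus were spread evenly, the ratio would be near 1 and the coefficient of variation near 0.

```python
from statistics import mean, pstdev

# Approximate deaths per million residents, as quoted in the post.
DEATHS_PER_MILLION = {
    "New York": 400,
    "New Jersey": 218,
    "Michigan": 108,
    "Florida": 19,
    "California": 14,
    "Texas": 8,
    "Montana": 6,
}

values = list(DEATHS_PER_MILLION.values())

# Ratio of the hardest-hit state to the least-hit state in this list.
spread_ratio = max(values) / min(values)

# Standard deviation relative to the mean; 0 would mean identical rates.
coefficient_of_variation = pstdev(values) / mean(values)

print(f"Highest rate is {spread_ratio:.0f} times the lowest")
print(f"Coefficient of variation: {coefficient_of_variation:.2f}")
```

New York's rate is roughly 67 times Montana's in this list, which is hard to square with a virus that is already evenly distributed.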
But, one might ask, is it possible that deaths
previously assigned to other causes were actually caused by COVID-19
before the virus was known and publicized? Yes, there are very likely deaths caused
by COVID-19 which have not yet been correctly attributed to the disease. If accounted
for, could these deaths correct the heterogeneity in the data and place
us back in Scenario 3? Figure 1 shows the known deaths in New York from COVID-19 compared
with flu mortality numbers from 2014-2017.
Already, COVID-19 has doubled or will soon double the mortality of the flu for these
years, and unfortunately the number of infections has continued to grow at an
alarming pace (Figure 2).
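Figure 1: Known COVID-19 deaths in New York compared with flu mortality, 2014-2017.
Figure 2: Reported COVID-19 cases in New York.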
So, while it is impossible to know, I believe it extremely
unlikely that a disease twice as deadly as a typical flu (at least in New York)
could go undetected in thousands of hospitals and laboratories across the United
States.
Assuming Scenario 2
With some reports saying 80% of cases are asymptomatic, an
estimate of 5x as many people infected with COVID-19 as have been reported
might not be crazy. This would mean that the actual number of people infected with
COVID in New York is something like 850,000, which is encouraging in that 9,000
deaths out of 850,000 (1%) is much better than 9,000 deaths out of 170,000
people (5.3%).
The problem of course is that even the inflated number of 850
thousand is only 10% of the city’s population and 4.6% of the total population of
the state, meaning we would still have a vast potential population left to
infect. On top of that, somewhere between 6,000
and 10,000 new cases are popping up every day in the state despite a ‘stay at home’
order that has been in effect for two and a half weeks.
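The New York numbers above follow from a few lines of arithmetic, sketched here in Python. The reported cases, deaths, and the 5x undercount factor come from the text. The New York City population of roughly 8.4 million is an approximate outside figure that I am assuming for illustration; it is not taken from the data used in this post.

```python
NY_REPORTED_CASES = 170_000   # reported cases, as quoted in the post
NY_DEATHS = 9_000             # reported deaths, as quoted in the post
UNDERCOUNT_FACTOR = 5         # assumption: true cases are 5x reported cases
NYC_POPULATION = 8_400_000    # assumption: approximate city population

# Estimated true infections under the assumed undercount.
true_infections = NY_REPORTED_CASES * UNDERCOUNT_FACTOR

# Mortality measured against reported cases versus estimated true cases.
observed_rate = NY_DEATHS / NY_REPORTED_CASES
adjusted_rate = NY_DEATHS / true_infections

# Share of a city-sized population that would already have been infected.
share_infected = true_infections / NYC_POPULATION

print(f"Estimated true infections: {true_infections:,}")
print(f"Mortality vs reported cases: {observed_rate:.1%}")
print(f"Mortality vs estimated true cases: {adjusted_rate:.1%}")
print(f"Share of city-sized population infected: {share_infected:.0%}")
print(f"Share still uninfected: {1 - share_infected:.0%}")
```

Changing UNDERCOUNT_FACTOR to 2 or 10 shows how sensitive both the mortality rate and the uninfected share are to an assumption nobody can currently verify.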
Looking at the graph (Figure 2), the number of cases in New
York has grown very rapidly. Yet, presumably the number of cases would be even
greater if the lockdown order were not in effect.
Yet most of us don’t live in New York. How much should we be
worried?
As New York has an above-average health care system and a
relatively low proportion of at-risk elderly residents, New York could be seen
as a lower-risk state compared with many. Yet New York City is also the
densest city in the country, with perhaps the highest use of public transportation
and correspondingly high use of potential public infection points such as
grocery stores, theaters, restaurants, etc.
Looking at only states which have reported more than 5,000
cases and scaling counts by log10, we get Figure 3. In Figure 3 it is hard to
make out much except that the overall shape of the infection curve seems to be
similar across states.
Figure 3: Total number of cases by state for states reporting at least 5,000 cases.
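The filtering and scaling behind a plot like Figure 3 can be sketched in Python as follows. The state names and case counts here are made up for illustration and are not the data behind the figure; the function name is likewise hypothetical. The point of the log10 scale is that a constant growth factor shows up as equal-sized steps, so curves with similar growth look like parallel lines.

```python
import math

MIN_CASES = 5_000  # only states reporting more than this many cases are kept

# Hypothetical cumulative case counts, for illustration only (not real data).
cumulative_cases = {
    "State A": [1_000, 2_000, 4_000, 8_000],
    "State B": [500, 1_000, 2_000, 4_000],
}


def log10_curves(cases_by_state, min_cases=MIN_CASES):
    """Keep states whose latest count exceeds min_cases and log10-scale them.

    Counts must be positive, since log10 is undefined at zero.
    """
    return {
        state: [math.log10(count) for count in counts]
        for state, counts in cases_by_state.items()
        if counts[-1] > min_cases
    }


curves = log10_curves(cumulative_cases)

# Day-over-day change on the log scale for the one state that was kept.
series = curves["State A"]
steps = [later - earlier for earlier, later in zip(series, series[1:])]
print(sorted(curves))
print([round(step, 3) for step in steps])
```

In this toy example State A doubles each day, so every step on the log scale is the same size, while State B is dropped because it never passes the threshold.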
It is difficult to make comparisons between states and to
make predictions from Figure 3. However, one technique often used is to pick a
point in time with a certain number of cases and then compare how the growth rate in
cases changed for other states after they reached the same point. In this
case, I will pick the earliest date in my dataset, March 18th, when New
York had around 2,500 reported cases of COVID-19. This number
was reached later by other states: New Jersey on the 23rd,
California on the 25th, Washington on the 26th, Michigan,
Florida, and Illinois on the 27th, and so on.
Plotting cases starting at this common point now gives us a
means of comparing case growth by state (Figure 4). Under this technique, New
York definitely appears to have a higher growth rate followed by New Jersey
with Michigan, California, Louisiana, Massachusetts, Pennsylvania, Illinois, Texas,
Georgia, and many other states following a less aggressive but still positive
growth trajectory.
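The alignment step can be sketched in a few lines of Python. This is a minimal illustration with made-up case counts, not the code or data used to produce the figure, and the function name is hypothetical. Each series is trimmed so that day 0 is the first day the cumulative count reaches the threshold, after which growth can be compared day for day.

```python
THRESHOLD = 2_480  # reported cases that define day 0


def align_to_threshold(cumulative_cases, threshold=THRESHOLD):
    """Return the series starting at the first day it reaches the threshold.

    Returns an empty list if the threshold is never reached.
    """
    for day, count in enumerate(cumulative_cases):
        if count >= threshold:
            return cumulative_cases[day:]
    return []


# Hypothetical cumulative counts by calendar day, for illustration only.
state_a = [900, 1_600, 2_600, 4_200, 6_800]
state_b = [2_000, 2_500, 3_000, 3_600, 4_300]

aligned_a = align_to_threshold(state_a)
aligned_b = align_to_threshold(state_b)

# Growth factor over the first two days after day 0 for each state.
growth_a = aligned_a[2] / aligned_a[0]
growth_b = aligned_b[2] / aligned_b[0]
print(f"State A grew {growth_a:.2f}x in two days after day 0")
print(f"State B grew {growth_b:.2f}x in two days after day 0")
```

Because states cross the threshold on different calendar dates, the aligned series have different lengths, so comparisons are only meaningful over the days that both series cover.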
Figure 4: Day 0 is the first day a state passes 2,480 cases
of reported COVID-19.
Conclusions
COVID-19 appears to be really bad and New York has been
hit the hardest - so far. How bad? We won’t know until after the crisis has passed. Fortunately, other states had lower
rates around the time the country (the President) started taking this crisis
seriously. Since then, those states appear to be on a more gradual growth
trajectory than that of New York.
Yet despite widespread concern over COVID-19, and despite instructions
and mandates to help reduce the spread, new infections are still on the
rise (Figure 5). And this is under conditions in which we have put a stop to in-person social gatherings, closed restaurants, and ordered residents to stay
indoors in many states. What happens when the public gets tired of such restrictions? It seems likely that growth rates of new infections would start rising
rapidly once again.
Figure 5
Francis, your scenario analysis is more considered than many I have seen by epidemiologists. It needs wide exposure.
One suggestion: your figure 4 adjusts the data so that graphs can be visually compared. However, the comparison would be fairer if the starting point was a day in which the numbers of reported cases was expressed in terms of cases per million population, rather than the unadjusted number you use.
Thank you for making your analysis and code available.
Michael