Normalizing data, or setting the data up so that it is easier to read and compare, can help ensure that comparisons of raw case numbers won’t favor highly populated communities like New York City or Detroit, or leave out communities with lower populations altogether. One useful way to normalize coronavirus case data across communities is to use per capita data.
Defining Per Capita Data
In the midst of the coronavirus pandemic, comparing data between places is important for understanding the growth of infections, the hardest-hit areas, and resource allocation. Comparing raw numbers usually falls short because it skews the picture towards areas with higher populations. Normalizing data, for example by calculating a per capita rate of cases, can help ensure that communities with lower populations aren’t left out when they are compared with highly populated communities like New York City or Detroit.
Raw numbers can hide hot spots of infection in areas with lower populations because healthcare capacity and infrastructure are built around community needs. As this graphic shows, 100 cases in a small, rural community represent a much larger portion of the population than 100 cases in a large, urban area. These communities will more than likely have significantly different tipping points in terms of overall case counts when you consider the systems built to support a community of 1,000 versus one of 100,000.
In the rural community, 100 cases could easily overwhelm the system completely. In the larger, urban community, 100 cases, probably divided between multiple hospitals, would be merely a blip on the system’s radar. When you consider the “per capita” count, the rural community has 100 cases per 1,000 people, whereas the urban community has 1 case per 1,000 people.
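The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration: the function name `per_capita_rate` and its `per` parameter are made up for this example, and the populations are the 1,000 and 100,000 figures used above.

```python
def per_capita_rate(cases: int, population: int, per: int = 1_000) -> float:
    """Return the number of cases per `per` residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing so the round numbers used here stay exact.
    return cases * per / population


# The same 100 cases in two communities of very different size.
rural = per_capita_rate(100, 1_000)    # community of 1,000 people
urban = per_capita_rate(100, 100_000)  # community of 100,000 people

print(f"Rural: {rural:g} cases per 1,000 people")
print(f"Urban: {urban:g} cases per 1,000 people")
```

Passing `per=100_000` or `per=1_000_000` gives the per 100,000 and per million rates used later in this article.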
Ideally, we would know the exact number of beds normally available in these hospitals and could then create ratios against the number of coronavirus patients who need to be hospitalized, but this level of detail is difficult to collect precisely, especially in the middle of a fast-moving pandemic. While imperfect, a per capita number helps normalize cases for capacity issues that scale with population size, like the number of hospital beds.
Michigan’s Per Capita Data
For example, based on the state’s census of hospitals and American Community Survey population estimates, we can compare two emergency preparedness regions in Michigan. Region 1 includes Clinton, Eaton, Gratiot, Hillsdale, Ingham, Jackson, Lenawee, Livingston and Shiawassee counties. Region 2S includes Wayne, Monroe, and Washtenaw counties.
If we just focus on the overall number of cases, Region 2S will always look significantly worse off because it has more than five times the number of cases. However, Region 2S also has about twice the population of Region 1. When we compare per capita cases, the difference starts to level off. Instead of being 5.6 times higher, as the raw case count is, the case rate is closer to 2.6 times higher, giving us a better understanding of what the state faces in more rural areas.
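The way the gap shrinks follows from a simple relationship: the ratio of per capita rates equals the ratio of case counts divided by the ratio of populations. As a rough check, assuming only the rounded 5.6 and 2.6 figures quoted above (the regions' actual case and population counts are not reproduced here), we can back out the population ratio those figures imply. The function name is hypothetical.

```python
def implied_population_ratio(case_ratio: float, rate_ratio: float) -> float:
    """Back out the population ratio from a raw-count ratio and a rate ratio.

    rate_ratio = case_ratio / population_ratio, so
    population_ratio = case_ratio / rate_ratio.
    """
    if rate_ratio <= 0:
        raise ValueError("rate_ratio must be positive")
    return case_ratio / rate_ratio


case_ratio = 5.6  # Region 2S raw cases relative to Region 1 (rounded, from the text)
rate_ratio = 2.6  # Region 2S per capita rate relative to Region 1 (rounded, from the text)

population_ratio = implied_population_ratio(case_ratio, rate_ratio)
print(f"Implied population ratio: about {population_ratio:.2f}")
```

The result is a little over 2, which is consistent with Region 2S having roughly twice the population of Region 1.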
We can use this per 100,000 resident rate to examine hospital bed availability too. Region 2S has almost three times the number of hospital beds. At first glance, it looks like Region 2S could deal with many more cases, almost five times the number. However, when we normalize the number of beds, we see that Region 2S does have a higher rate of available beds, but the gap is not as large as the raw difference might suggest.
Understanding Per Capita Data at a Local Level
If we break down the analysis into counties, the differences between the two comparisons are even more apparent, which also highlights how important local data is for strategic responses to the coronavirus. Early on, we noticed that many tools were comparing raw numbers, so we created a per capita county-level map for the whole country. Here you can see how normalizing to population size (in this map, cases per million) changes the hot spots of cases.
In the first map, where overall case numbers are displayed, our eye is immediately drawn to Metro Detroit. Initially, this makes sense as the number of cases in the tri-county area far outstrips the rest of the state. However, remember that the population sizes vary widely between counties (Keweenaw County has 2,136 people). So, when we change the rate to per capita, more counties start to become part of the ‘hot spot’.
The per capita data can level the playing field between communities with significantly different populations. We might notice higher rates of cases in northern Michigan, as in Otsego County, where 77 cases is actually 311 per 100,000 residents, and in Genesee County, where 1,298 cases is 316 per 100,000. These per capita rates are still lower than those of Macomb County (516 cases per 100,000) or Wayne County (812 cases per 100,000), but they are significantly closer than the raw case numbers for Otsego County (77 cases) and Wayne County (14,255).
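To make that gap concrete, here is a small sketch that uses only the county figures quoted above. Macomb County is left out because its raw case count is not given here. The populations printed at the end are back-calculated from the cases and rates, so treat them as rough approximations rather than census figures.

```python
# County figures as quoted in the text: raw cases and cases per 100,000 residents.
counties = {
    "Otsego": {"cases": 77, "rate_per_100k": 311},
    "Genesee": {"cases": 1_298, "rate_per_100k": 316},
    "Wayne": {"cases": 14_255, "rate_per_100k": 812},
}

wayne, otsego = counties["Wayne"], counties["Otsego"]

# How many times larger Wayne looks than Otsego under each measure.
raw_ratio = wayne["cases"] / otsego["cases"]
rate_ratio = wayne["rate_per_100k"] / otsego["rate_per_100k"]

print(f"Raw cases:      Wayne is about {raw_ratio:.0f} times Otsego")
print(f"Per 100k rate:  Wayne is about {rate_ratio:.1f} times Otsego")

# Back-calculate an approximate population from each quoted count and rate.
for name, data in counties.items():
    implied_population = data["cases"] / data["rate_per_100k"] * 100_000
    print(f"{name}: roughly {implied_population:,.0f} residents implied")
```

On raw counts Wayne County looks about 185 times worse than Otsego County, while on the per capita rate the difference is under three times.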
As we noted in our Pandemic Data Consumption Guide, case data could tell us a few things. This might show us that Hillsdale and Otsego Counties have been hit harder overall by coronavirus. However, without knowing details about the local areas, including each county’s limitations in testing, we can’t rule out that the numbers instead reflect better access to testing in these counties.
In Detroit, the city started reporting zip code level data. While the city’s data includes per capita rates, many news outlets reported only the overall case counts. Zip code 48235 has both the highest number of cases overall and a high rate per 100,000 residents (data accessed 4/21/2020). However, the zip code with the highest rate of cases is 48207, a zip code with about 308 cases overall. Downtown Detroit also drew some attention: the zip code had only about 42 cases, but given the downtown area’s smaller population, its case rate is much higher than one might expect, at 921 cases per 100,000. Per capita data is an important metric to keep track of as we continue to respond to the data.