Blog Archive

Archives by Month:

June 2016

Losing Confidence in our Mid-Sized Community Estimates: Census Bureau Discontinues 3-Year American Community Survey

No matter who you are, you have probably encountered the Census Bureau at some point. It’s likely that you even responded to the Census by providing information on the number of people living in your household and a few other minor details. The first Census took place in 1790 and was overseen by none other than Thomas Jefferson, then Secretary of State. So important is the Census that it was constitutionally mandated to be taken every ten years. Most people know the Census Bureau through this Decennial Census, which used to include a short form (a handful of questions about the number of people living in your household, etc.) and a long form (a larger set of questions about your demographic and socio-economic background). After the 2000 Census, the Census Bureau decided that estimates produced once every 10 years were not timely enough, which led it to fully implement the American Community Survey (ACS) in 2005.

The ACS has provided estimates that are consistent with the long form, but on a much more frequent basis. Every year the Census Bureau surveys a little more than 1% of the US population. 1% sounds pretty small, right? Actually, we are talking about 1% of roughly 323 million people (You can check the US and World population clock here). So, the Census Bureau is surveying about 3.5 million United States housing units every single year. This is great news, because the ACS is used for all sorts of decision making. Federal agencies use the ACS to allocate federal funding. The Census Bureau estimates that the ACS is used to allocate more than $400 billion per year. Many private-sector organizations use the ACS for marketing their products (and likely generate a great deal of revenue of their own in the process). Researchers use the ACS to evaluate the impact of governmental policies at all levels of government. While surveying 3.5 million housing units per year is great for many purposes, at certain geographical scales there are too few people being surveyed to fully understand the impacts of governmental or private-sector decisions. It might be too much of a stretch to call the 3-year estimates the Goldilocks of ACS data, but the next several sections outline why the 3-year estimates are quite important.

Size of Government

To provide more accurate data (smaller confidence intervals if you want to nerd out), the United States Census Bureau takes 1-year, 3-year, and 5-year averages of the ACS. An easy way to think about this is that a 1-year average is great if you want to make estimates for governments with 65,000 or more people. Okay, so this could be everything from Southfield (73,001 in 2014) to Detroit (680,281 in 2014). A 5-year estimate is great if you want to make inferences about smaller cities with a population of 20,000 or fewer. This could include Michigan governments as small as Alpha Village (83 in 2014) or as large as Ypsilanti (19,844 in 2014). The 3-year estimates are perfect for medium-sized communities with populations between 20,000 and 65,000. Examples of governments this size include Birmingham city (20,368 in 2013) and Taylor city (62,232 in 2013). It should also be noted that many counties across the United States fall precisely within this population range.
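The rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the post's population cutoffs; the function name is made up for this example, and treating exactly 20,000 as a 5-year case is an assumption based on the "20,000 or fewer" wording.

```python
def suggested_acs_estimate(population: int) -> str:
    """Return the ACS estimate this post suggests for a community of a given size.

    Hypothetical helper: the cutoffs are the post's rule of thumb,
    not an official Census Bureau lookup.
    """
    if population < 0:
        raise ValueError("population cannot be negative")
    if population >= 65_000:
        return "1-year"
    if population > 20_000:
        return "3-year"
    return "5-year"


# Populations quoted in the post (2013 and 2014 figures).
communities = [
    ("Detroit", 680_281),
    ("Southfield", 73_001),
    ("Taylor", 62_232),
    ("Birmingham", 20_368),
    ("Ypsilanti", 19_844),
    ("Alpha Village", 83),
]

for name, population in communities:
    print(f"{name}: {suggested_acs_estimate(population)}")
```

With the 3-year estimates gone, every community in the middle branch has to fall back on the 5-year data.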


While thinking about the appropriate size of the community is the most obvious way to consider the benefits of the 3-year and 5-year estimates, one should also consider the time frame. So, assuming you weren’t living under a rock for a few years, you probably heard about (or felt economically) the Great Recession, which hit the United States, and Detroit in particular, hard. If we want to understand what happened in this period relative to the pre- and post-recession periods among various communities, the ACS is a great source. However, if we use the 5-year ACS (2007-2011; 2008-2012; 2009-2013) we combine vastly different periods. As noted by RLS Demographics, if we have 3-year estimates we could compare the pre-recession period (2005-2007) to the recession (2008-2010), and compare both to the post-recession period (2011-2013). This type of comparison is especially important if we are adopting policies to ameliorate the effects of the recession and also want to know if they worked. The point is that we need not think about these 3-year estimates as just being appropriate to different-size cities. They are also very important when we want to separate different points in time.
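To see how much the 5-year windows blur the picture, here is a rough sketch that counts what share of each window falls inside the recession years. It assumes the 2008-2010 recession window named above; the function name is invented for this illustration.

```python
# The recession window used in this post (2008-2010, inclusive).
RECESSION_YEARS = set(range(2008, 2011))


def recession_share(start_year: int, end_year: int) -> float:
    """Fraction of the years in [start_year, end_year] that are recession years."""
    years = range(start_year, end_year + 1)
    inside = sum(1 for year in years if year in RECESSION_YEARS)
    return inside / len(years)


five_year_windows = [(2007, 2011), (2008, 2012), (2009, 2013)]
three_year_windows = [(2005, 2007), (2008, 2010)]

for start, end in five_year_windows + three_year_windows:
    share = recession_share(start, end)
    print(f"{start}-{end}: {share:.0%} recession years")
```

Every 5-year window mixes recession and non-recession years, while each of the two 3-year windows sits entirely on one side of the line.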

Subpopulation Analysis

For researchers studying subpopulations (by gender, race, ethnicity, age, etc.) and politicians making decisions that will affect small communities within small- to mid-sized geographies, the 1-year ACS will provide extremely large confidence intervals. In other words, estimates become unreliable. It’s like me telling you that there is a 51% chance I will make a basket, but I could be off by 25 percentage points in either direction. Therefore, the real likelihood of me making the shot falls somewhere between 26% and 76%. Still, your odds, on average, of betting on me in this fictitious scenario are better than you will get at Motor City Casino, but understand that a good proportion of the time you will be wrong! Comparing Tables 1 and 2 shows the importance of the 3-year estimates for subpopulations.
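For anyone who wants to nerd out a little further, here is a minimal sketch of why pooling more responses tightens the interval. It uses the textbook margin-of-error formula for a proportion from a simple random sample, which is a simplification and not the Census Bureau's actual variance method. The 90% confidence level matches how ACS margins of error are published; the sample size of 100 is made up purely for illustration.

```python
import math

Z_90 = 1.645  # z-score for a 90% confidence level


def margin_of_error(proportion: float, sample_size: int, z: float = Z_90) -> float:
    """Textbook margin of error for a proportion (simple random sample assumed)."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


def interval(estimate: float, moe: float) -> tuple:
    """Lower and upper bounds, clipped to the valid 0-1 range for a proportion."""
    return (max(0.0, estimate - moe), min(1.0, estimate + moe))


# The basketball example: 51%, give or take 25 percentage points.
low, high = interval(0.51, 0.25)
print(f"Basket: {low:.0%} to {high:.0%}")

# Hypothetical subpopulation sample: one year of responses vs. three years pooled.
one_year_n = 100
moe_1yr = margin_of_error(0.51, one_year_n)
moe_3yr = margin_of_error(0.51, 3 * one_year_n)
print(f"1-year margin: {moe_1yr:.3f}, 3-year margin: {moe_3yr:.3f}")
```

Under these assumptions, tripling the sample shrinks the margin of error by a factor of the square root of 3, so the pooled interval is roughly 58% as wide as the 1-year interval.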


Table 1: Comparing Large and Small Counties Total Population (1- and 3-year estimates)




When we look at Table 1 and compare the labor force population for the 1-year and 3-year estimates between a large county (Wayne County) and a medium-sized county (Marquette County), there seems to be little difference in the estimates. This could suggest that we are fine with simply using the 1-year estimates. However, the real issue comes in when we analyze subpopulations (Table 2).


Table 2: Comparing Large and Small Counties Subpopulation (1- and 3-year estimates)



Table 2 makes clearer why the 3-year estimates are important when considering what happens to subpopulations. We have a significantly different sense of the labor market in 2012 when comparing the 1-year and 3-year ACS in Marquette County. Even the poverty levels in a large county like Wayne are off by 3.5%.

Great, now that we all agree (implicit because you are still reading) that we need the 1-year ACS estimates for very timely results with larger populations; the 3-year estimates for moderate-sized geographies, reasonable time-frames, and more precise subpopulation results; and the 5-year estimates for much smaller geographies and more refined subpopulation analyses, I am here to let you know we have a PROBLEM! In 2015, the United States Congress cut the Census Bureau’s budget by $2.4 million. As a result of this cut, the Census Bureau discontinued the 3-year ACS estimates.

So Why Discontinue the 3-year ACS?

It’s certainly not entirely the fault of the Census Bureau; the political attack on the ACS could be fodder for several blog posts. The irony in these attacks runs deep once one realizes that the efforts to cut the ACS have been led by members of Congress who rely indirectly, and often directly, on the data for their own personal and professional uses.

Well, the most obvious answer is that it will help with the budget shortfall (here is a copy of the FY2016 budget proposal). One justification that has been given by the Bureau is that the 3-year estimates were only meant to be temporary. However, as the RLS Demographics group makes clear, if this is true it was never mentioned to the communities that have become dependent on these estimates.

A Parting Thought and a Revealing Graphic

It’s true that losing the 3-year estimates is not nearly as bad as when several members of Congress were calling for the discontinuation of the ACS in general, which would have effectively left us with no precise estimates of the current demographic and socio-economic standing of communities across the United States. Arguably, losing the 1-year or 5-year estimates would be significantly worse. Thus, while we try to see the glass as half full (by averaging the glass when it is about 1/5th full and when it is about 4/5ths full), losing the 3-year ACS does really limit our ability to derive reliable estimates for mid-size cities, subpopulations, and within useful and reasonable time-frames.


What To Do Now?

Best case scenario… Run for Congress. Win the seat. Become a ranking member and be placed on the Appropriations Committee. Pass a bill with broad bi-partisan support in both chambers that is quickly signed by the President that reinstitutes the 3-year estimates. Ensure that the Census Bureau has ample resources to carry this out in a timely manner.

Worst case scenario… Contact the Census Bureau and your member of Congress and let them know that you think the 3-year estimates are important. Then, share this blog post with your friends, family, colleagues, neighbors, and anyone who will listen through every medium possible (snail mail with the website URL at the head of the letter, call someone’s pager from a landline and let them know to check out the blog post, send the link by email, post the link on Facebook, write a short Twitter feed, or take a picture of the site and post it on Instagram). Whatever the case, spread the word!

An Analysis of the 100 Worst US Metropolitan Areas to Live with Spring Allergies

It’s that time of year. The birds are headed back north and the sun is shining brighter than ever. Even when it rains, a lingering smell of life breeds excitement for the months to come. Each drop grows flowers, brightens grass, and brings green life back to the trees. All of these things, working in concert, are nature’s way of reminding us that the world is not dead, willing us to finally come out from under our blankets of crippling seasonal depression.

Unfortunately, for anyone who suffers from seasonal allergies, those same beautiful reminders that spring has sprung take on a very different connotation and make its arrival a mixed blessing. The Asthma and Allergy Foundation of America (AAFA) estimates that more than 50 million Americans are living with seasonal nasal allergies (allergic rhinitis). This prompted the AAFA to conduct an annual study to determine the 100 worst metropolitan areas to live in for spring allergy sufferers, using three metrics to quantify “suffering”:

  • The pollen score for each metropolitan area, as created by the American Academy of Allergy, Asthma and Immunology (AAAAI), because seasonal allergies in the spring are most commonly caused by pollen and mold spores
  • Allergy medication purchases in each metropolitan area
  • The availability of board-certified allergists in each metropolitan area

An overall score was calculated using these three scores (at the risk of being too kitschy, I refer to this score as a suffer score). It was then used to compare all metropolitan areas, creating a list of the 100 worst nationwide.

When I stumbled upon this information, I thought it might be interesting to plot these metropolitan areas on a map of the US and compare them to one another (and it was!). With the help of some light analysis already conducted by the AAFA, I decided to create two different maps to showcase a few aspects of the dataset they created. Here’s the result!

Figure 1: 100 Worst Metropolitan Areas to Live with Spring Allergies, 2016

This map was fun to create. Each point on the map represents a different metropolitan area. The color of each point indicates whether a metropolitan area’s suffer score was above average, average, or below average when compared to all 100 metropolitan areas. Detroit and Grand Rapids are the only two metropolitan areas in Michigan that made the list. Although Detroit’s ranking increased 11 positions and Grand Rapids’ increased 3 since 2015, both still rank near average on a national scale.

As we can see, the metro areas with an above-average suffer score are located primarily in the South, extending north into the Northeast. This makes sense considering the parameters used to define this dataset. Several of the worse-than-average areas lie within regions with high pollen scores and a climate that strongly supports mold spore growth. HERE, you can find a 2010 NPR interview with Dr. Estelle Levetin in which she explains in more detail why that makes sense.

It’s important to keep in mind that the meaning of a suffer score is somewhat counter-intuitive: a below-average suffer score is a good thing (or at least better than an above-average suffer score)! Also, to produce both maps, I converted a polygon shapefile of US metropolitan areas into a point shapefile. Because of this, some points appear to be slightly misplaced. This occurred because each point actually represents a larger, irregularly shaped area.
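As a rough illustration of why a single point can look misplaced: the GIS tool used for the conversion is not named here, so this sketch simply assumes each polygon was reduced to its centroid, which is one common approach. The U-shaped polygon is a made-up stand-in for an irregular metro boundary.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("polygon has zero area")
    return cx / (3 * twice_area), cy / (3 * twice_area)


def point_in_polygon(point, vertices):
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        if (y0 > y) != (y1 > y):  # this edge crosses the horizontal line through the point
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical U-shaped "metro area" (counter-clockwise vertices).
u_shape = [(0, 0), (3, 0), (3, 3), (2, 3), (2, 1), (1, 1), (1, 3), (0, 3)]

centroid = polygon_centroid(u_shape)
print("Centroid:", centroid)
print("Inside the polygon?", point_in_polygon(centroid, u_shape))
```

For this shape the centroid lands in the notch of the U, outside the area it is supposed to represent, which is the same kind of effect that makes some points on the maps look a little off.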

Figure 2: 100 Worst Metropolitan Areas to Live with Spring Allergies, by Region, 2016

This map offers insight into the ranking of each Metropolitan Area’s suffer score compared to its region. Each point on the map represents a Metropolitan Area and the color of the point represents a hierarchy of suffer score rankings. This hierarchy is then applied to each of the four US regions. The results are as follows:

South: This region holds 38 of the 100 metropolitan areas (38.0%). Of those 38, 10 have a suffer score that is worse than average (26.3%).

West: This region holds 23 of the 100 metropolitan areas (23.0%). Of those 23, none of them have a suffer score that is worse than average (0%).

Midwest: This region holds 21 of the 100 metropolitan areas (21.0%). Of those 21, 5 have a suffer score that is worse than average (23.8%).

Northeast: This region holds 18 of the 100 metropolitan areas (18.0%). Of those 18, 3 have a suffer score that is worse than average (16.7%).
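The regional percentages above are easy to double-check. Here is a small sketch using only the counts listed in this post; the variable and function names are just for this example.

```python
# (metro areas on the list, areas with a worse-than-average suffer score)
REGIONS = {
    "South": (38, 10),
    "West": (23, 0),
    "Midwest": (21, 5),
    "Northeast": (18, 3),
}


def percent(part: int, whole: int) -> float:
    """Percentage of whole represented by part, rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_areas = sum(areas for areas, _ in REGIONS.values())

for region, (areas, worse) in REGIONS.items():
    print(
        f"{region}: {percent(areas, total_areas)}% of the list, "
        f"{percent(worse, areas)}% worse than average"
    )
```

The four regions add up to all 100 metropolitan areas, and 18 of them in total have a worse-than-average suffer score.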

It became apparent the moment I stumbled upon this dataset that there were several interesting things I could do with it. By the time my mind stopped racing in all directions, I had landed on five different analyses I wanted to depict. Unfortunately, I also had no concept of how long each one would take. This resulted in two maps that could be completed before spring was over and three maps of great promise, but no realistic life expectancy. Sad as it is, I am going to explain the concept of my favorite map that never came to be in my final section titled…

Figure 3 (Not Pictured). The Map That Got Away

The concept of this map was built upon a comparison of each metropolitan area’s overall ranking in 2015 and 2016. At first, I didn’t expect much difference in a one-year timeframe. Five years? Maybe. After a quick calculation, though, some serious trends emerged. My map was going to showcase the five areas whose conditions improved the most and the five areas where conditions declined the most. Even in just those 10 areas, a pattern began to form. Areas of improvement clustered in the South, and areas of decline clustered in the eastern Midwest and western Northeast. It would have been cool to visualize this comparison when applied to all 100 areas. Luckily, the AAFA conducts a similar study in the fall, giving me ample time to prepare. I am looking forward to working more with similar data sets!
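The quick calculation behind that map could look something like the sketch below. It assumes rank 1 is the worst place to live with allergies, so moving to a larger rank number counts as an improvement. The area names and ranks are placeholders, not AAFA data, and the function name is invented for this example.

```python
def biggest_movers(ranks_2015, ranks_2016, n=5):
    """Return (most improved, most declined) areas as lists of (area, rank change).

    Assumes rank 1 is the worst, so a positive change (a larger rank number
    in 2016) means conditions improved relative to the other areas.
    """
    common = ranks_2015.keys() & ranks_2016.keys()
    change = {area: ranks_2016[area] - ranks_2015[area] for area in common}
    improved = sorted(change.items(), key=lambda item: (-item[1], item[0]))[:n]
    declined = sorted(change.items(), key=lambda item: (item[1], item[0]))[:n]
    return improved, declined


# Placeholder ranks for illustration only.
ranks_2015 = {"Area A": 10, "Area B": 20, "Area C": 30, "Area D": 40}
ranks_2016 = {"Area A": 25, "Area B": 18, "Area C": 30, "Area D": 12}

improved, declined = biggest_movers(ranks_2015, ranks_2016, n=1)
print("Most improved:", improved)
print("Most declined:", declined)
```

Joining the resulting changes back to the point shapefile would be the natural next step toward the map that got away.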