Data Isn’t Colorblind

Over the past few months, we’ve explored how to critically evaluate data sources and understand potential limitations of reports about data related to the coronavirus or the impacts of coronavirus on data collection. One very important concept that we talk about regularly in the office is that our data sources only reflect what is “reported and recorded”. This means that we are responsible for filling in the blanks and theorizing what might not be reported, what might not be recorded, and more importantly, why those data points could be missed and what impact their absence would have on our analysis and conclusions. 

It is important to recognize that any and all data is collected through a process which can have built-in biases, structural inequities, and other confounding factors, all of which impact the resulting data we use to make decisions in the world.  

We recognize that there are biases built into data collection processes and actively try to tell the stories that illuminate the grey areas. For example, collecting qualitative data helps highlight what might happen before and after a datapoint is collected. During our Turning the Corner project, the stories we collected from Southwest and North End business owners and residents humanized the data about two neighborhoods experiencing change as new residents moved in. These interviews helped us to understand how an individual might experience different scenarios that create the same administrative data point.

One topic that drew a wide range of input was residents’ experience with crime data. Our theory was that residents in neighborhoods vulnerable to significant change would experience lower crime rates, making those neighborhoods more attractive to investors. The stories we heard highlighted how different groups of residents were more or less likely to call the police depending on their background, their immigration status, their cultural practices, and more. In some cases this meant that one group of residents was reporting much less crime through official channels than another group in the same neighborhood.

When we start thinking about how crime data is reported and recorded, we can see how there are many places for the data collection to be less accurate and even skewed in an undetectable manner by a neighborhood’s demographics. 

Process of Reporting Crime

After a crime is committed, someone has to call 911, and a multitude of factors affect whether someone calls the police. When the police respond, their response times vary by neighborhood, which can affect whether an arrest is made or a case is closed. Then the police report has to record the correct address and incident details, and its accuracy can vary depending on how busy a neighborhood or day is, or even on which officer is assigned. Finally, crime data in Detroit goes through processing to remove personally identifiable information so that it is de-identified and ready to be published.
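To make the compounding effect of these steps concrete, here is a minimal Python sketch. Every number in it is hypothetical and chosen only to illustrate the arithmetic; none of it comes from Detroit crime data or from our interviews. The stage names and the helper function recorded_incidents are also assumptions made for illustration. The sketch treats each step as a rate at which incidents make it through, and compares two imagined neighborhoods that differ only in how often residents call 911.

```python
# All rates below are hypothetical and exist only to illustrate the arithmetic.
def recorded_incidents(true_incidents, stage_rates):
    """Return the expected number of incidents that survive every stage."""
    expected = float(true_incidents)
    for rate in stage_rates.values():
        expected *= rate  # each stage passes along only a share of incidents
    return expected

# Two imagined neighborhoods with the same number of actual incidents.
neighborhood_a = {
    "reported_to_911": 0.8,
    "report_filed": 0.9,
    "address_recorded_correctly": 0.95,
}
neighborhood_b = {
    "reported_to_911": 0.4,  # residents here are less likely to call
    "report_filed": 0.9,
    "address_recorded_correctly": 0.95,
}

count_a = recorded_incidents(100, neighborhood_a)
count_b = recorded_incidents(100, neighborhood_b)
print(round(count_a, 1), round(count_b, 1))
```

Under these made-up rates, both neighborhoods experience 100 incidents, yet the published data would show roughly twice as many in the first as in the second. The gap comes entirely from who calls the police, not from how much crime occurs.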

The same way of thinking can be applied to almost any dataset, including building permits, another area where we heard interesting stories. For example, some neighborhoods are more likely to trade services between neighbors and thus less likely to go through the formal pathways for home renovations that would generate building permits. As neighborhoods attract new residents from other communities, the enforcement of building permits can change. As one resident put it during our interviews, new residents seem to think: “Oh hey, this is nice, but that ain’t up to code. I’m going to call the inspectors.” This doesn’t mean these communities aren’t being invested in; it means that some of those investments are not being reported or recorded.

The 2020 U.S. Census is underway and the Census Bureau has identified a variety of demographics that are harder to count, including people of color, immigrants, young children, multifamily home residents, renters, non-native English speakers, and more. If these populations aren’t counted completely, then the resources used to support them and even their fundamental representation in Congress are at risk.

Hard to Count Populations

Recognizing these shortcomings of data collection helps us to empower Detroit residents with data about their community while recognizing that their lived experiences aren’t always captured by a single datapoint. We are continuing to work with neighborhood organizations in order to understand the full picture, and encourage all members of our community to collaborate with our network by bringing data-driven questions to AskD3.

Copyright © 2022 Data Driven Detroit. All Rights Reserved.