The Census Data Journalism Toolbox

By Benjamin Livingston and Bernardo Lopez Vicencio, NewsCounts

Introduction

As part of NewsCounts' research supporting journalists covering the 2020 United States Census, the team's data scientists have developed a simple data toolbox for newsrooms.

This toolbox lets journalists build datasets and visualizations that help them and their audiences understand the importance of the census in their local areas, and measure how successful those areas have been at getting their residents counted.

These tools were developed by Benjamin Livingston ("BWL") and Bernardo Lopez Vicencio ("BLV"). Each tool's author is noted next to its title.

Feel free to reach out to either of us with any questions you might have - we'd be overjoyed to help you get started with these tools and tailor them to your newsroom.

Benjamin Livingston can be reached at benjamin.livingston@columbia.edu.

Bernardo Lopez Vicencio can be reached at bl2786@columbia.edu.

1) Daily Census Response Rates API (BLV)

Interactive Dashboard

API Documentation

GitHub Repository

We built our own API that improves upon the US Census Bureau's response rates API, making it easier to retrieve daily response rate data from the 2020 census. The Census Bureau's API provides only the latest response numbers, while ours provides daily numbers going back months, allowing you to track how your area's response rate has changed over time.

Using this intuitive tool, you can create tables and graphs of response rates over time at the state, county, or tract level. It also allows you to easily download your local data as a CSV file.
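As a rough illustration of what you can do with a downloaded CSV, the sketch below computes how much each area's response rate changed between its first and last recorded dates. The column names (`date`, `name`, `response_rate`) and the sample values are assumptions made for this example, not the API's documented schema, so adjust them to match your own export.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; a real export may use different column names.
SAMPLE_CSV = """date,name,response_rate
2020-04-01,County A,38.4
2020-04-02,County A,41.3
2020-04-03,County A,43.0
2020-04-01,County B,35.0
2020-04-03,County B,39.5
"""

def rate_change_by_area(csv_text):
    """Return {area: (first_date, last_date, change in percentage points)}."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        series[row["name"]].append((row["date"], float(row["response_rate"])))
    changes = {}
    for area, points in series.items():
        points.sort()  # ISO-formatted dates sort chronologically as strings
        (first_date, first_rate), (last_date, last_rate) = points[0], points[-1]
        changes[area] = (first_date, last_date, round(last_rate - first_rate, 1))
    return changes

changes = rate_change_by_area(SAMPLE_CSV)
for area, (start, end, delta) in changes.items():
    print(f"{area}: {delta:+.1f} points from {start} to {end}")
```

To run this on your own file, replace `SAMPLE_CSV` with the text of the CSV you downloaded.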

responserates.jpg

2) The Demographics Of Census Nonresponses In Your Region (BWL)

Guide with Examples

GitHub Repository

"Undercounted" is a popular buzzword as the 2020 US Census takes shape - but what does it mean for your region?

We built a highly customizable workflow that lets you quantify undercounting in your region by showing which demographic factors correlate most strongly with low response rates during the current census. You can also track how these demographic trends appear to be changing (or not changing) since 2010.

The guide even includes a link to a Google Colab document that allows you to re-run the code for your local area easily.

This is a powerful template that gives you a ready-made starting point for telling a data-driven story about demographics and census response rates in your region.
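To give a sense of the underlying idea, here is a minimal sketch that ranks demographic factors by how strongly they correlate with response rates. It is not the guide's actual workflow: the factor names and tract-level numbers are made up for illustration, and it assumes a plain Pearson correlation, which measures linear association only and says nothing about causation.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical tract-level data: response rate (%) and demographic shares (%).
response_rate = [72.0, 65.0, 58.0, 50.0, 44.0]
factors = {
    "pct_renters": [20.0, 30.0, 45.0, 55.0, 70.0],
    "pct_over_65": [18.0, 15.0, 16.0, 12.0, 11.0],
    "pct_foreign_born": [10.0, 12.0, 9.0, 14.0, 11.0],
}

def rank_factors(factors, target):
    """Sort factors by the absolute strength of their correlation with target."""
    scored = [(name, pearson(values, target)) for name, values in factors.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

ranking = rank_factors(factors, response_rate)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

A negative value means the factor tends to rise as response rates fall. Running the same ranking on 2010 data would let you compare how the relationships have shifted.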

illinois.png

3) Enumeration Disparities Caused By Suspension Of Census Field Operations (BWL)

Guide with State-by-State Breakdown

GitHub Repository

The suspension of US Census Bureau field operations due to the COVID-19 pandemic has created vast disparities in census counts between rural and non-rural areas, and has put rural areas at risk of extremely low response rates.

We thoroughly detailed how wide these gaps have become by visualizing them both at the state level and the national level.

This analysis is vital for helping these rural areas understand just how crucial responding to the census will be for them as field operations begin again.

enumeration2.png responserates.png

4) Internet Availability & Response Rates (BLV)

GitHub Repository

We looked into the relationship between county-level internet availability and census response rates. Due to the COVID-19 pandemic and the introduction of an online response option, internet availability is more crucial to the census than ever before.

Using data from the American Community Survey and the 2020 census, we computed the correlation between the proportion of a region's population with an internet connection and its census response rate, and shared our results.

internet%20response%20rates.png

5) Using Congressional Apportionment Data In Census News Stories (BWL)

Guide with Examples

GitHub Repository

One of the census' founding purposes is determining how many congressional seats each state will receive over the next decade.

This raises the question - how close did each state come to gaining or losing a congressional seat in 2010, and how close is each to gaining or losing one in 2020?

We provided an extensive analysis that tracked the apportionment consequences of undercounts (or lack thereof) in 2010 and 2020, and looked at how they can be illustrated in news stories.
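For readers who want to see the mechanics, House seats are assigned using the method of equal proportions (Huntington-Hill): every state starts with one seat, and each remaining seat goes to the state with the highest priority value, P / sqrt(n(n + 1)), where P is the state's population and n is its current seat count. The sketch below applies that method to toy populations and estimates how many more residents a state would have needed to win another seat. The state names, populations, seat total, and helper function names are all hypothetical, and the guide's own analysis may be set up differently.

```python
import heapq
from math import ceil, sqrt

def apportion(populations, total_seats):
    """Method of equal proportions. Returns ({state: seats}, award log)."""
    seats = {state: 1 for state in populations}  # every state starts with one seat
    # heapq is a min-heap, so priorities are stored negated.
    heap = [(-pop / sqrt(2), state) for state, pop in populations.items()]
    heapq.heapify(heap)
    awards = []  # (priority, state) for each seat beyond the first, in award order
    for _ in range(total_seats - len(populations)):
        neg_priority, state = heapq.heappop(heap)
        seats[state] += 1
        awards.append((-neg_priority, state))
        n = seats[state]
        heapq.heappush(heap, (-populations[state] / sqrt(n * (n + 1)), state))
    return seats, awards

def people_short_of_next_seat(populations, seats, awards, state):
    """Extra residents `state` needed to outrank the weakest seat won by a rival."""
    weakest_rival = min(priority for priority, winner in awards if winner != state)
    n = seats[state]
    needed = weakest_rival * sqrt(n * (n + 1))
    return max(0, ceil(needed - populations[state]))

# Toy example: three hypothetical states sharing seven seats.
populations = {"A": 1000, "B": 600, "C": 400}
seats, awards = apportion(populations, 7)
shortfall = people_short_of_next_seat(populations, seats, awards, "C")
print(seats, shortfall)
```

With real data you would pass in each state's apportionment population and 435 total seats. The shortfall holds every other state's population fixed, so treat it as an illustration rather than a forecast.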

2020_seats.png

6) 2020 Census Twitter Activity (BLV)

GitHub Repository

This repository contains a group of notebooks that explore tweets about the census.

We compared tweets about the 2020 census with tweets about the 2010 census to see how the conversation has changed over time. We also identified the most active users and the most frequently shared hashtags and URLs, and looked into different topics of interest in the census conversation.

This code could easily be adapted to search for other terms.
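As a small illustration of this kind of analysis, the sketch below counts the most common hashtags in a list of tweet texts. It is not taken from the notebooks: the sample tweets are invented, the function name is hypothetical, and it assumes you already have the tweet text loaded as plain strings.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")

# Hypothetical sample tweets, standing in for a real collected dataset.
tweets = [
    "Fill out your form today! #2020Census #EveryoneCounts",
    "Our county's response rate is climbing #2020census",
    "Why the count matters for school funding #Census2020 #2020Census",
]

def top_hashtags(texts, n=3):
    """Return the n most common hashtags, ignoring letter case."""
    counts = Counter(tag.lower() for text in texts for tag in HASHTAG.findall(text))
    return counts.most_common(n)

top = top_hashtags(tweets)
print(top)
```

Changing the regular expression, for example to match mentions or URLs instead of hashtags, adapts the same counting approach to other questions.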

census%20tweets.png

7) Statutory Population Thresholds Determined By The Census (BLV)

Guide with Examples

GitHub Repository

The information obtained from the 2020 census will help determine funding distribution, grant assignments, and the general organization of towns and cities.

We searched each state's law codes for population thresholds that determine whether a city receives certain funding or certain rights based on census counts. To do so, we scraped the FindLaw website, saved the results to a CSV file, then searched that file for the most useful codes.
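The final search step might look something like the sketch below, which scans statute text for phrases such as "population of more than 50,000" and extracts the threshold. This is only an assumed approach for illustration: the regular expression, the citations, and the statute wording are all made up, and it does not reproduce the scraping code or the project's actual search terms.

```python
import re

# Assumed phrasing patterns; real statutes word their thresholds in many other ways.
THRESHOLD = re.compile(
    r"population\s+(?:of\s+)?"
    r"(?P<relation>not more than|not less than|more than|less than|at least|"
    r"exceeding|under|over)\s+"
    r"(?P<number>\d{1,3}(?:,\d{3})+|\d+)",
    re.IGNORECASE,
)

# Hypothetical (citation, text) records, standing in for rows of the saved CSV.
statutes = [
    ("Code A 1-101", "Any city with a population of more than 50,000 may establish a transit authority."),
    ("Code B 2-202", "A county having a population of not less than 100,000 is eligible for the grant."),
    ("Code C 3-303", "The board shall meet quarterly."),
]

def find_thresholds(records):
    """Return (citation, relation, threshold) for each population threshold found."""
    hits = []
    for citation, text in records:
        for match in THRESHOLD.finditer(text):
            number = int(match.group("number").replace(",", ""))
            hits.append((citation, match.group("relation").lower(), number))
    return hits

hits = find_thresholds(statutes)
for hit in hits:
    print(hit)
```

Any match still needs to be read in context, since a pattern like this cannot tell what the threshold actually triggers.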