At BuzzFeed, data scientists have the opportunity to play with public-facing data.
In this post, we study data aggregated in 2014 by ProPublica, an independent non-profit investigative newsroom. The visualization below uses data spanning the 2011-12 school year, and shows seclusions/restraints for disabled and non-disabled students across the nation.
Yet many in the U.S. remain unaware that these events even exist.
Click on the map below to explore your state and school.
* "Physical restraints" refer to physical holds in which a student's ability to move their head, torso, arms or legs is limited.
* "Mechanical restraints" use something artificial like straps, handcuffs or bungee cords to do the restraining.
* Finally, "seclusions" refer to situations in which a student is confined against their will in a room they are prevented from leaving.
This is the first time that the federal government (specifically, the U.S. Department of Education's Office for Civil Rights) has attempted to collect data from all schools.
Last year, ProPublica used the data to co-publish a piece with NPR. They found that "restraining and secluding [public school] students for any reason remain[ed] perfectly legal under federal law."
Attorney Jessica Butler also created an aggregate-level graphic for ProPublica, which gives users an excellent national-level view.
But the data could be a whole lot better.
Percentages for most schools are small. This could be because the data is self-reported, which discourages schools from disclosing whether they use seclusions or restraints. And often, administrators and teachers use such practices to protect students from self-harm, rather than to "punish" them.
In fact, more than 85% of the values in each column are zeros, making it difficult to draw meaningful insights from the statistics. The national averages for all 95,635 schools, shown in the table below, are nearly meaningless with such sparse data.
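To make the sparsity claim concrete, here is a minimal sketch of how the share of zeros in each column could be measured. The column names and sample rows are hypothetical, invented for illustration; they are not the actual field names or values in the ProPublica data.

```python
def zero_share_by_column(rows, columns):
    """Return {column: fraction of rows whose value in that column is 0}."""
    if not rows:
        return {col: 0.0 for col in columns}
    return {
        col: sum(1 for row in rows if row[col] == 0) / len(rows)
        for col in columns
    }


# Hypothetical column names and made-up rows, for illustration only.
COLUMNS = ["pct_physical", "pct_mechanical", "pct_seclusion"]
sample_rows = [
    {"pct_physical": 0.0, "pct_mechanical": 0.0, "pct_seclusion": 0.0},
    {"pct_physical": 0.0, "pct_mechanical": 0.0, "pct_seclusion": 1.5},
    {"pct_physical": 2.0, "pct_mechanical": 0.0, "pct_seclusion": 0.0},
    {"pct_physical": 0.0, "pct_mechanical": 0.0, "pct_seclusion": 0.0},
]

zero_shares = zero_share_by_column(sample_rows, COLUMNS)
for column, share in zero_shares.items():
    print(f"{column}: {share:.0%} zeros")
```

When most schools report zero, a column's mean is dominated by those zeros, which is why an average taken over every school says little about the schools that do report incidents.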
Furthermore, the reliability of non-zero data remains questionable.
Even after removing the outliers, the data still looks suspect.
Some schools, like Birch Elementary School in Idaho, have the exact same percentages (0.348%) across all categories for disabled students.
Others, like East Central Junior High School in Oklahoma, report identical figures for disabled and non-disabled students, suggesting a copy-and-paste error.
As a result, even methods like data imputation become infeasible.
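The two warning signs described above (one percentage repeated across every category, and one set of figures reported for both student groups) can be checked mechanically. The sketch below is one possible approach, not the method used for this article. The field names, school labels and values are all made up for illustration.

```python
# Hypothetical category and field names; the real data set may differ.
CATEGORIES = ("physical", "mechanical", "seclusion")


def same_across_categories(school, group):
    """True if one non-zero percentage is repeated in every category."""
    values = [school[f"{group}_{c}"] for c in CATEGORIES]
    return values[0] != 0 and all(v == values[0] for v in values)


def groups_duplicated(school):
    """True if the disabled and non-disabled figures match and are not all zero."""
    disabled = [school[f"disabled_{c}"] for c in CATEGORIES]
    non_disabled = [school[f"non_disabled_{c}"] for c in CATEGORIES]
    return any(v != 0 for v in disabled) and disabled == non_disabled


def suspicious_flags(school):
    """List the names of the checks that a school's record trips."""
    flags = []
    if same_across_categories(school, "disabled"):
        flags.append("same_across_categories")
    if groups_duplicated(school):
        flags.append("groups_duplicated")
    return flags


# Made-up records, for illustration only.
schools = {
    "School A": {
        "disabled_physical": 0.5, "disabled_mechanical": 0.5,
        "disabled_seclusion": 0.5, "non_disabled_physical": 0.0,
        "non_disabled_mechanical": 0.0, "non_disabled_seclusion": 0.0,
    },
    "School B": {
        "disabled_physical": 1.2, "disabled_mechanical": 0.0,
        "disabled_seclusion": 0.4, "non_disabled_physical": 1.2,
        "non_disabled_mechanical": 0.0, "non_disabled_seclusion": 0.4,
    },
    "School C": {
        "disabled_physical": 0.0, "disabled_mechanical": 0.0,
        "disabled_seclusion": 0.0, "non_disabled_physical": 0.0,
        "non_disabled_mechanical": 0.0, "non_disabled_seclusion": 0.0,
    },
}

flagged = {name: suspicious_flags(record) for name, record in schools.items()}
for name, flags in flagged.items():
    print(name, flags or "no flags")
```

All-zero records are deliberately skipped, since zeros dominate the data and would otherwise trip both checks. Exact equality is used because a copied value matches to the last digit; a flag is a prompt for manual review, not proof of an error.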
If the data is incorrect, should we even care?
Often, isolating a student enables uninterrupted learning for classmates and is beneficial for the student herself. ProPublica's piece takes the opposite stance, demonstrating through first-hand experiences the ways in which restraints can get completely out of hand.
Yet despite the opposing opinions, sharing such data is a way to start a necessary conversation about how transparent schools are in handling students.
Here's how you can help get more accurate data.
You can find all code for the visualization in this article here.
Update 6/8/2015 10:37 am EST. This article has been updated to use more neutral wording, in keeping with the point of the piece.