When classifying cases for all but trivial problems -- for example, classifying a student as failing, an armed conflict as a war, or a recipe as Italian, Thai, or French cuisine -- we need to choose a tradeoff between catching true positives and avoiding false positives. This is true even in criminal justice, where we would like to classify persons as criminals or law-abiding citizens.
One way to measure the justness of a criminal-justice system is to look at "n guilty men": how many guilty persons that system lets go free for every innocent person it punishes. North Korea cares far more about capturing true positives than about avoiding false positives, so it will arrest and punish people even when there isn't much evidence of wrongdoing. In contrast, democracies tend to err on the side of the accused, letting more guilty men go free to avoid imprisoning the innocent. Reasonable people can disagree about what n should be, but few would argue that North Korea's n, which is surely below 1, is more just than a democratic n, which is often claimed to be 1 or greater.
It turns out that getting a large n is often difficult. The table below tries to make this point using a hypothetical terrorist-identification program that screens a population of 300,000,000 (smaller than the US population) containing 10,000 "terrorists" (probably more than there are in the US).
Whether the program uses human experts or a sophisticated data-mining algorithm doesn't matter: even when we can correctly classify the vast majority of terrorists and civilians, n will be small.
| Quantity | Value |
|---|---|
| Terrorists in the US | 10,000 |
| % Terrorists Flagged | 99.9 |
| % Civilians Not Flagged | 99.99 |
| % Flagged Who Are Civilians | 75.0 |
| "n Guilty Men" | .0003 |
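The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the 10,000 terrorists and 300,000,000 people given in the text, and it reads the 99.99 figure as the share of civilians the program correctly leaves unflagged. The function and variable names are mine, not part of any real program.

```python
def screening_outcomes(population, guilty, share_guilty_flagged, share_innocent_cleared):
    """Return confusion-matrix counts and "n guilty men" for a flagging program.

    All inputs are hypothetical; the shares are fractions between 0 and 1.
    """
    innocent = population - guilty
    guilty_flagged = guilty * share_guilty_flagged                # true positives
    guilty_missed = guilty - guilty_flagged                       # false negatives
    innocent_flagged = innocent * (1 - share_innocent_cleared)    # false positives
    innocent_cleared = innocent - innocent_flagged                # true negatives
    flagged = guilty_flagged + innocent_flagged
    return {
        "guilty_flagged": guilty_flagged,
        "guilty_missed": guilty_missed,
        "innocent_flagged": innocent_flagged,
        "innocent_cleared": innocent_cleared,
        "pct_flagged_who_are_innocent": 100 * innocent_flagged / flagged,
        # n = guilty people let free per innocent person flagged
        "n": guilty_missed / innocent_flagged,
    }


# Assumed inputs: the population and terrorist counts from the text,
# with 99.9% of terrorists flagged and 99.99% of civilians left alone.
result = screening_outcomes(
    population=300_000_000,
    guilty=10_000,
    share_guilty_flagged=0.999,
    share_innocent_cleared=0.9999,
)
print(f"Civilians flagged: {result['innocent_flagged']:,.0f}")
print(f"% flagged who are civilians: {result['pct_flagged_who_are_innocent']:.1f}")
print(f"n guilty men: {result['n']:.4f}")
```

Under these assumptions the program misses only about 10 terrorists but flags roughly 30,000 civilians, which is why about three-quarters of the people it flags are innocent and n comes out near .0003. Changing the inputs shows how accurate the classifier would have to be on civilians before n gets anywhere close to 1.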
Ayesha Mahmud, Kenny Joseph, and I wrote about the trip and how it affects our work. The post begins:
Police departments around the country have been in the spotlight recently because of several controversial, high-profile incidents. Tragic events in Ferguson, New York City, Baltimore, and elsewhere have highlighted the need for police departments to better address the issue of adverse interactions between the police and the public. Many police departments are working hard to avoid these negative interactions with new technologies and tactics, while others are leading new data collection efforts.
This summer, as part of the White House Police Data Initiative, fellows Sam Carton, Kenny Joseph, Ayesha Mahmud, and Youngsoo Park, technical mentor Joe Walsh, and project manager Lauren Haynes are working with the Charlotte-Mecklenburg Police Department (CMPD) on a novel approach: using data science to improve the department’s Early Intervention System (EIS) for flagging officers who may be at a high risk for being involved in an adverse interaction.
Read more here.
Eugenia Giraudy and I wrote a blog post introducing our Data Science for Social Good project:
In 2005, Florida implemented a new “Stand Your Ground” law, which legally protected the use of deadly force in self-defense. The law, which removes the “duty to retreat” when a person is threatened with serious bodily harm, gained national attention after George Zimmerman fatally shot Trayvon Martin in 2012.
Soon after its passage in Florida, Stand Your Ground laws went “viral,” spreading to other parts of the country. Currently, at least two dozen states have implemented a version of Florida’s legislation. These laws didn’t arise in response to broad, spontaneous popular demand. Interest groups, in particular the National Rifle Association and the American Legislative Exchange Council (ALEC), drafted a model bill to ease passage across the country. Ten states have passed nearly identical bills to the ones Florida used and ALEC promoted.
Read more here.
My Data Science for Social Good earmarks team wrote a post for the Sunlight Foundation about difficulties encountered in trying to find congressional earmarks. It begins:
Last week, President Obama kicked off the fiscal year 2016 budget cycle by unveiling his $3.99 trillion budget proposal. Congress has the next eight months to write the final version, leaving plenty of time for individual senators and representatives, state and local governments, corporate lobbyists, bureaucrats, citizens groups, think tanks and other political groups to prod and cajole for changes. The final bill will differ from Obama’s draft in major and minor ways, and it won’t always be clear how those changes came about. Congress will reveal many of its budget decisions after voting on the budget, if at all.
To continue reading, click here.
There are at least three reasons bad weather causes less traffic enforcement:
- There are fewer police available for traffic enforcement because they are busy responding to traffic accidents.
- Speeding is the easiest violation to ticket, and speeding is less common when there's precipitation.
- Officers generally have discretion to apprehend and ticket traffic violators, so they can often choose whether to stop dangerous drivers. They'd also prefer to stay in their warm, dry patrol cars. Therefore, all else being equal, they will stop fewer vehicles when it's cold or wet.
Although police are less likely to enforce the law in bad weather, when they do make a stop they are more likely to issue a ticket (and that ticket will be more expensive), for a couple of reasons:
- Cops issue more tickets for serious violations, and it takes a more serious violation to draw a cop out of his car in bad weather.
- Getting a cop out of his car in bad weather makes him worse off, and cops prefer to punish anyone who makes their lives more difficult.
In summary, the police will probably leave you alone when it's nasty outside, but if they don't they're bringing their ticket book with them.
I am serving as a mentor for the Eric and Wendy Schmidt Data Science for Social Good program this year. Madian Khabsa (one of my fellows) and I wrote about our Congressional-earmarks project for the DSSG blog. Here's the beginning of the article:
Earmarks have been called “the best known, most notorious, and most misunderstood aspect of the congressional budgetary process.” These government budget items allocated to specific people, places, or projects are alternately described as a subversion of democracy or an important negotiation tool to smooth the passage of controversial legislation. But despite the attention earmarks attract, they remain extremely tedious and time-consuming to identify in federal bills and reports that may be hundreds of pages long.
This summer, Data Science for Social Good fellows Matthew Heston, Madian Khabsa, Vrushank Vora, and Ellery Wulczyn and mentor Joe Walsh, working with Christopher Berry at the Harris School of Public Policy, will help shine a light on earmarks, building computational tools to automatically identify them in Congressional texts.
You can read more here.
The Washington Post's political-science blog, The Monkey Cage, briefly covered my research conducted with The University of Alabama's Greg Austin on Department of Defense media behavior (article). Here's the abstract:
Are political actors more likely to release bad news when it is least likely to be noticed? Former government and administration spokespersons claim they chose when to release information harmful to their cause when they were on the job (see Norris 2005); there are numerous anecdotes of negative news stories being released late on Friday (see Theimer 2009); and an episode of The West Wing suggests that the politicians try to release lots of bad news together on Friday, an act the fictional White House deputy chief of staff calls “taking out the trash.” Despite these popular accounts, there has been little systematic investigation of strategically timed news dumps. In this paper we look at the empirical record of Take out the Trash Day. We begin by outlining and building on the reasons that political actors would strategically release information, taking into account the mediating role that technology may play (Lee 2005). We then correlate the positivity and negativity of more than 12,500 Department of Defense news releases from October 1994 through February 2013 to their release days. We find mixed evidence for the hypothesis.
This article is part of a larger research agenda on taking out the trash. I have also found that the president holds most Medal of Honor ceremonies between Monday and Thursday and that federal courts have overturned a disproportionate number of same-sex marriage bans on Friday. I will blog more about those results as I refine them.