Scoping Projects

We wrote about what a good Data Science for Social Good project looks like. Our post begins:

Data Science for Social Good is a summer program that requires year-round preparation. A successful summer requires a mix of good people and projects, and we spend a lot of time trying to find projects to solve and the people to solve them. In addition to reading over 800 applications from aspiring fellows, mentors, and project managers, we’ve spent numerous hours researching, pursuing, and scoping projects: exploring datasets, speaking with representatives, and wrangling with attorneys. Well over a hundred projects will cross our emails, phones, and eyes before we find the 12 to do next summer.

Read more here.

Visiting the Charlotte-Mecklenburg Police Department

Ayesha Mahmud, Kenny Joseph, and I wrote about the trip and how it affects our work. The post begins:

Police departments around the country have been in the spotlight recently because of several controversial, high-profile incidents. Tragic events in Ferguson, New York City, Baltimore, and elsewhere have highlighted the need for police departments to better address the issue of adverse interactions between the police and the public. Many police departments are working hard to avoid these negative interactions with new technologies and tactics, while others are leading new data collection efforts.

This summer, as part of the White House Police Data Initiative, fellows Sam Carton, Kenny Joseph, Ayesha Mahmud, and Youngsoo Park, technical mentor Joe Walsh, and project manager Lauren Haynes are working with the Charlotte-Mecklenburg Police Department (CMPD) on a novel approach: using data science to improve the department’s Early Intervention System (EIS) for flagging officers who may be at a high risk for being involved in an adverse interaction.

Read more here.

Text Re-Use in Scott Walker's Abortion Bill

Eugenia Giraudy, Matt Burgess, Julian Katz-Samuels, and I wrote another blog post for Data Science for Social Good. It starts:

On Monday, Wisconsin governor and 2016 presidential candidate Scott Walker signed into law a bill banning non-emergency abortions past the 19th week of pregnancy. Unsurprisingly, Walker’s move garnered support from one side, derision from the other, and media attention from both. However, journalists face a big hurdle when trying to provide context for a story such as this: it is time-consuming to figure out how many states have introduced similar legislation and where it originated.

Automated detection of copied legislation can help. Data Science for Social Good fellows Matt Burgess, Eugenia Giraudy, and Julian Katz-Samuels, technical mentor Joe Walsh, and project manager Lauren Haynes are working with the Sunlight Foundation to make it easier to find re-used text. Using Sunlight’s corpus of state legislation, our computational tools uncover textual similarities.

Read more here.

Finding Legislative Plagiarism

Eugenia Giraudy and I wrote a blog post introducing our Data Science for Social Good project:

In 2005, Florida implemented a new “Stand Your Ground” law, which legally protected the use of deadly force in self-defense. The law, which removes the “duty to retreat” when a person is threatened with serious bodily harm, gained national attention after George Zimmerman fatally shot Trayvon Martin in 2012.

Soon after its passage in Florida, Stand Your Ground laws went “viral,” spreading to other parts of the country. Currently, at least two dozen states have implemented a version of Florida’s legislation. These laws didn’t arise in response to broad, spontaneous popular demand. Interest groups, in particular the National Rifle Association and the American Legislative Exchange Council (ALEC), drafted a model bill to ease passage across the country. Ten states have passed nearly identical bills to the ones Florida used and ALEC promoted.

Read more here.

OpenGov Voices: Bringing Transparency to Earmarks Buried in the Budget

My Data Science for Social Good earmarks team wrote a post for the Sunlight Foundation about difficulties encountered in trying to find congressional earmarks. It begins:

Last week, President Obama kicked off the fiscal year 2016 budget cycle by unveiling his $3.99 trillion budget proposal. Congress has the next eight months to write the final version, leaving plenty of time for individual senators and representatives, state and local governments, corporate lobbyists, bureaucrats, citizens groups, think tanks and other political groups to prod and cajole for changes. The final bill will differ from Obama’s draft in major and minor ways, and it won’t always be clear how those changes came about. Congress will reveal many of its budget decisions after voting on the budget, if at all.

To continue reading, click here.

Who Attends Chicago Public Schools? A Breakdown by Race

While Chicagoans understand that white students are less likely to enroll in CPS schools than students of color, few seem to know how big the difference is.  Using Bayes' formula and publicly available data (located here, here, and here), I calculated the probability that a Chicago child of African American, Hispanic, Asian, white, and multi-racial descent attends the public schools.  Here are the results:

5-19 year olds in Chicago   Pr(attends CPS)   Pr(race)   Pr(race | attends CPS)   Pr(attends CPS | race)
Hispanic                    0.7971            0.400      0.452                    0.90
African American            0.7971            0.390      0.397                    0.81
Asian                       0.7971            0.036      0.035                    0.78
Multi-Racial                0.7971            0.017      0.011                    0.52
White                       0.7971            0.155      0.092                    0.47

Pr(attends CPS) is the probability that a child (between 5 and 19 years old) in Chicago attends a public school.  Pr(race) is the probability that a randomly chosen child in Chicago belongs to a given race; for example, the probability that a Chicago child is white is 15.5%.  Pr(race | attends CPS) is the probability that a randomly chosen CPS student is a given race; for example, about 39.7% of CPS students are African American.

From these three inputs, I calculate Pr(attends CPS | race), the probability that a child of a given race attends a Chicago public school, as Pr(race | attends CPS) × Pr(attends CPS) / Pr(race). While nine in ten Hispanic children and eight in ten African American children in Chicago are enrolled in CPS, fewer than five in ten white children are. An African American child is about 1.7 times as likely to attend a CPS school as a white child. A Hispanic child is almost twice as likely.
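For anyone who wants to check the arithmetic, here is a minimal Python sketch of the Bayes' formula step. It uses the rounded figures from the table as inputs, so a result can differ from the table in the last digit; the variable and function names are my own and purely illustrative.

```python
# Rounded inputs taken from the table above.
P_CPS = 0.7971  # Pr(attends CPS), the same for every group

# group -> (Pr(race), Pr(race | attends CPS))
INPUTS = {
    "Hispanic": (0.400, 0.452),
    "African American": (0.390, 0.397),
    "Asian": (0.036, 0.035),
    "Multi-Racial": (0.017, 0.011),
    "White": (0.155, 0.092),
}


def p_cps_given_race(p_race, p_race_given_cps, p_cps=P_CPS):
    """Bayes' formula: Pr(CPS | race) = Pr(race | CPS) * Pr(CPS) / Pr(race)."""
    return p_race_given_cps * p_cps / p_race


# Apply the formula to every group in the table.
results = {
    group: p_cps_given_race(p_race, p_race_given_cps)
    for group, (p_race, p_race_given_cps) in INPUTS.items()
}

for group, p in results.items():
    print(f"{group:17s} {p:.2f}")

# Relative likelihoods compared with white children.
for group in ("African American", "Hispanic"):
    ratio = results[group] / results["White"]
    print(f"{group} vs. white: {ratio:.1f}x")
```

Because Pr(attends CPS) is the same in every row, the ratio between two groups depends only on Pr(race | attends CPS) / Pr(race) for each group.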

Shining a Light on Earmarks

I am serving as a mentor for the Eric and Wendy Schmidt Data Science for Social Good program this year. Madian Khabsa (one of my fellows) and I wrote about our Congressional-earmarks project for the DSSG blog.  Here's the beginning of the article:

Earmarks have been called “the best known, most notorious, and most misunderstood aspect of the congressional budgetary process.” These government budget items allocated to specific people, places, or projects are alternately described as a subversion of democracy or an important negotiation tool to smooth the passage of controversial legislation. But despite the attention earmarks attract, they remain extremely tedious and time-consuming to identify in federal bills and reports that may be hundreds of pages long.

This summer, Data Science for Social Good fellows Matthew Heston, Madian Khabsa, Vrushank Vora, and Ellery Wulczyn and mentor Joe Walsh, working with Christopher Berry at the Harris School of Public Policy, will help shine a light on earmarks, building computational tools to automatically identify them in Congressional texts.

You can read more here.