Confronting the Data Dilemma

A few years ago, the City of Boston decided to tackle an age-old municipal problem in a new way. Using the motion-sensing capabilities of smart phones, volunteers who download Boston’s Street Bump app automatically send the city information about the condition of the streets they’re driving on. When their cars hit a pothole (or a pothole-to-be), their phones report the jolt to a data hub, which combines it with information from many other phones to pinpoint the stretches of street that need repair.

In the early days of the program, Street Bump found something fascinating: there were more potholes reported in wealthy areas of the city than in poor ones.

Jake Porway of DataKind told this story at The City Resilient, a recent gathering of people from diverse fields working to develop communities more resilient to catastrophic change. The event was curated by PopTech and presented in partnership with the Rockefeller Foundation and the Brooklyn Academy of Music. My ears perked up at Jake’s presentation, since in recent years the Rita Allen Foundation has found that more and more of the innovative work we support in social change and in biomedical research involves understanding large amounts of data.

“Why?” Jake asked the audience gathered in BAM’s Harvey Theater. Why were there more potholes in rich areas? A few answers came from the crowd. Someone suggested different traffic patterns. Then the right answer came: wealthy people were far more likely to own smart phones and to use the Street Bump app. Where they drove, potholes were found; where they didn’t travel, potholes went unnoted.

This is a cautionary story, but not an alarming one. The City of Boston realized the problem of the skewed data and is addressing it. It reminds us, however, that for all of the power of new technology to capture vast amounts of data and to help us analyze its significance, we must always be alert to what the data might be misrepresenting, particularly for the most vulnerable.

The history of using data for social purposes in the United States is far from pristine.

Earlier in the day at The City Resilient, Dr. Mindy Thompson Fullilove of Columbia University had shown a harrowing set of slides. The first was a 1935 map of Pittsburgh with large sections in the middle colored in red and yellow. This was one of the maps of American cities that coded where an investment in a mortgage would be safe. Blue and green indicated a safe investment—yellow and red suggested extreme caution. The second slide was one of the forms completed by a surveyor that had determined the cautionary color of one of the sections. Neat handwriting followed the form’s typed categories. “Detrimental Influences: Negro encroachment threatening.” “Infiltration of: Jewish.” “Negro: Yes, 5%.”

Maps such as this came to guide loan decisions across the country. A bank loan officer didn’t need to know the basis for the colors on the map—only that loan applications in the red areas should be turned down. As Dr. Fullilove notes, lines of poverty and wealth in our cities still bear the imprint of redlining, as this practice came to be called, which was legal until 1968. (View the form and map in Dr. Fullilove’s new book, Urban Alchemy: Restoring Joy in America’s Sorted-Out Cities.)

Like many people today, I am deeply excited about the promise “big data” holds for recognizing patterns in society and tracking the effects of changes. The Sunlight Foundation uncovers data about government and then helps people analyze and understand it. Code for America Fellows use data science and technology to help local governments solve civic problems, including improving pretrial justice systems in Louisville and New York City. Investigative journalists analyze data to find corruption and waste, as in a recent Center for Public Integrity investigation into healthcare spending, which uncovered widespread questionable Medicare billing practices. The Rita Allen Foundation has supported each of these organizations, as well as biomedical scholars who use genetic data to make leaps forward in understanding and treating disease.

But along with all the promise, there is good reason to be apprehensive about the dangers of a data-driven society. There’s the potential for people to lose control of their personal information, and for a disconnect between people’s daily experiences and the algorithms trusted with making decisions that affect them. The history of redlining gives a stark picture of how prejudice can gain power through actuarial tools. Even when race discrimination isn’t intended, the effects can still be discriminatory: think stop-and-frisk. There are many additional perils, including that people could be denied loans because of their connections on social networks, or because of the other customers in the stores where they shop.

What’s to be done? The question of whether we should or shouldn’t rely on data as a society is moot. Collecting and understanding ever larger amounts of data is embedded in today’s biology, security, marketing, finance, engineering, and countless other areas.

Some ideas for how to mitigate the dangers of data came at another recent gathering, Ashoka’s Future Forum. In a panel on data and social change, Jeff Edmondson of the Strive Network, Alexandra Bernadotte from Beyond 12, Sascha Meinrath of the Open Technology Institute and John Wilbanks from Sage Bionetworks had a robust discussion about the promise and dangers of data in society. It’s a topic that has been fueling important debates and advocacy, and ought to be fueling more. Here are a few of the key principles that emerged from the panel that have particular importance for social-change efforts: data and analysis techniques should be open for public scrutiny; people from all areas of society should be aware of what data is being collected from them and how it will be used; some uses of data should be legally prohibited; and, in collecting and using data, we should strive for constant improvement, replacing problematic data with better data.

I’d add to this list that data alone should never be used to make social decisions. It should always be viewed side-by-side with the stories of individual people and communities.

Let’s go back to loans in American cities. Redlining may now be illegal, but it’s still extremely difficult for many small businesses and aspiring entrepreneurs to get loans from major banks. A short credit history can make getting a conventional loan nearly impossible, and the advent of big data may only make it more difficult for some people. Here’s where philanthropy can play a role—and not just the philanthropy of a few. Since 2005, Kiva has worked internationally to connect entrepreneurs with a new source of capital: crowdfunded loans given by visitors to Kiva’s site, who might contribute as little as $25 to help a farmer in India or a craftswoman in Uganda. Now Kiva’s loans are expanding in U.S. cities as well.

Kiva City Newark, which launched last week with start-up funding from the Rita Allen Foundation, is raising funds for the Artisan Collective on Halsey Street to refurbish their store and develop their customer base, Diego to purchase a van for his export business, and Maria to buy inventory and create an online presence for her hair salon (loans to these and select other borrowers are currently being matched by the Rita Allen Foundation). In Newark, Kiva identifies borrowers by working with the Intersect Fund—a nonprofit that provides training and microloans to small businesses in New Jersey—as well as with Newark’s Kiva Trustees—people with deep ties to the community. Borrowers are not selected on the basis of a single set of numbers, though financial responsibility is taken into account. The biggest factor, instead, is a borrower’s character and particular story. I urge you to take a look at the Kiva site to see if there’s a story that you connect with and would like to support.

And the result? Initially, Kiva borrowers often have low or no credit scores. After more than $450 million in loans around the world, given to more than 1 million borrowers, Kiva currently has a repayment rate of 99 percent. And as of last Thursday, 18 loans worth $80,250 had already been fully funded as part of Kiva City Newark. Now those are some good numbers.