with data comes responsibility
Cathy O'Neil's blog, mathbabe, has long been one of my favorites. She's a mathematician by training and her focus on data science speaks to the "empirical revolution" in economics.
Her book, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, came out last week and I highly recommend it to economists working with Big (or small) Data. She discusses several examples of opaque, large-scale models which have proved costly and unfair to some individuals. (See also several reviews here.) Think of it as tales from the dark side of Big Data.
And with economists flocking to work with Big Data, hers is a timely read.
O'Neil is careful not to point a finger at either "Big Data" or "algorithms" in general; rather, she is critical of how some models are designed and used, and of the objectives people set for them. I agree. It is a non-starter to blame our tools. Computers are stupid; they do exactly what we tell them to do. Excel spreadsheets don't make mistakes. I do. Like when I accidentally overwrote a number in a cell a few weeks ago. And that's not just in Excel. I have told computers to do stupid things with data in SAS, Visual Basic, Stata, Rats, R, etc. ... See a pattern? It's me. It's us. Not our algorithms. And yet, people are also the ones who set the goals, care enough to fix mistakes, and think about what it all means.
I see O'Neil's book as a useful counterweight to the positive news from economists on Big Data, machine learning, and the consumer benefits from data-driven algorithms. Levitt's analysis of Uber is just one recent example. And while his results may not be all that novel, the richness of the Big Data does open up a lot of potential to study behavior. But with this potential comes responsibility.
O'Neil's fundamental concern is the balance between efficiency and fairness. Chapter 8 on lending standards shows how complex the balancing act can be and how algorithms can serve either objective. FICO credit scores were a step forward in both efficiency and fairness in lending, since they tied creditworthiness to an individual's own credit-related behavior, not to race or gender or peers. The success of this risk-based pricing spawned other unregulated metrics of creditworthiness, referred to as e-scores. Unlike FICO scores, e-scores are not transparent and use proxies, such as zip codes or behavioral patterns of peers, rather than just individual behavior. On average, e-scores may also work well in predicting default and assessing risk, even above and beyond FICO scores, but they also have a tendency to embed pre-existing inequities, create negative feedback loops, and enable predatory lending. The issues O'Neil raises about efficiency and fairness are not new and certainly come up in other writing. Even so, with economists helping more and more to design such models ... and drawing inferences about people from Big Data, I think her concerns are worth thinking hard about.
This bit near the end of her book captures the big picture well:
"Big Data processes codify the past. They do not invent the future. Doing that requires moral imagination, and that's something only humans can provide. We have to explicitly embed better values in our algorithms, creating Big Data models that follow our ethical lead. Sometimes that will mean putting fairness ahead of profit." (p. 204)
Some economists, maybe even many, might push back on the idea that our empirical models embed values, let alone that we should strive to "embed better values." Yet setting up the objective function, which we do in almost every economic model, is nothing more than defining a value system. I am a lot more comfortable debugging my code (which keeps me plenty busy) than thinking about the ethical implications of my empirical analysis. Still, I think O'Neil makes a compelling case that we ought to wrestle more with these deeper issues.
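To make the point about objective functions concrete, here is a toy sketch of my own (it is not from O'Neil's book). The two lending rules, their profit figures, and their approval-rate gaps are made-up numbers, and the names `objective`, `best_rule`, and `fairness_weight` are hypothetical. The only thing it is meant to show is that the "best" rule changes once the objective puts any weight on fairness.

```python
# Toy illustration: the choice of objective function is a choice of values.
# All numbers and names below are hypothetical, chosen only for illustration.

# Each candidate lending rule: (name, expected profit, approval-rate gap between groups)
candidates = [
    ("rule_a", 100.0, 0.30),  # more profitable, larger gap
    ("rule_b", 90.0, 0.05),   # less profitable, smaller gap
]


def objective(profit, gap, fairness_weight):
    """Profit minus a penalty on the approval-rate gap.

    fairness_weight = 0 means the modeler values profit only.
    """
    return profit - fairness_weight * gap


def best_rule(rules, fairness_weight):
    """Return the name of the rule that maximizes the objective."""
    return max(rules, key=lambda r: objective(r[1], r[2], fairness_weight))[0]


print(best_rule(candidates, fairness_weight=0.0))
print(best_rule(candidates, fairness_weight=100.0))
```

With a weight of zero the sketch picks the more profitable rule; with a large enough weight it picks the rule with the smaller gap. Nothing about the data changed in between. Only the value system did.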
Addendum: Reading O'Neil's book this weekend reminded me that there's a one-day training session at the upcoming ASSA annual meetings on ethics, scientific integrity, and responsible leadership in economics. Maybe I will see some of you there.
**Opinions here are mine and should not be attributed to anyone with whom I work.**