When Algorithms Go Astray


The tale of Marc Elliott's algorithm raises a specter that should concern all of us in an era when companies are generating models and algorithms by the truckload.

Suppose that the good work of an analyst is hijacked or shared for use in ways it was never intended.

When I read the story in the Los Angeles Times, my mind drifted to those science/adventure novels where a researcher's laboratory work on a cure for a disease is turned against humanity by a villain.

The secondary use of Elliott's healthcare research doesn't threaten humanity. In fact, I don't want to judge whether its reuse was good or bad, but it's a scenario worth remembering.

Working for the research firm Rand Corp., Elliott helped to develop a method to spot disparities in the healthcare provided to minority patients. He was on a team of Rand researchers who developed Bayesian Improved Surname Geocoding more than a decade ago.

Faced with the reality that insurers and providers don't always collect race data from their clients, the tool draws inferences about race by matching last names with addresses. The algorithm assigns a percentage indicating the likelihood that an individual is white, black, Hispanic, or Asian. According to the LA Times article, the method for pairing two "knowns" -- name and address -- has been pretty effective at identifying an "unknown" -- someone's race -- in the healthcare role for which it was intended.
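
To make the idea concrete, here is a minimal sketch of that style of inference (my own illustration, not Rand's actual implementation): a surname-based race distribution and a neighborhood-based one are multiplied together and renormalized. The probability tables below are invented placeholders, not real Census figures.

```python
# Illustrative BISG-style inference: combine a surname-based race distribution
# with a neighborhood-based one. All probability tables are made-up placeholders.

RACES = ["white", "black", "hispanic", "asian"]

# Hypothetical P(race | surname), e.g., the kind of table a surname list might yield.
SURNAME_PRIOR = {
    "garcia": {"white": 0.05, "black": 0.01, "hispanic": 0.92, "asian": 0.02},
}

# Hypothetical racial composition of the neighborhood tied to an address.
NEIGHBORHOOD = {
    "block_group_123": {"white": 0.70, "black": 0.05, "hispanic": 0.15, "asian": 0.10},
}

def race_probabilities(surname: str, geo_id: str) -> dict:
    """Return a normalized probability for each race, given a surname and a location."""
    prior = SURNAME_PRIOR[surname.lower()]
    geo = NEIGHBORHOOD[geo_id]
    # Multiply the two evidence sources, then renormalize so the values sum to 1.
    # (A simplification of the fuller Bayesian updating a production method would use.)
    combined = {r: prior[r] * geo[r] for r in RACES}
    total = sum(combined.values())
    return {r: round(p / total, 3) for r, p in combined.items()}

print(race_probabilities("Garcia", "block_group_123"))
```

Even in this toy form, you can see why the same arithmetic travels so easily: the inputs are just a name and an address, whether the downstream question is about healthcare disparities or auto loans.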

So, Elliott -- who has a PhD in statistics -- was quite surprised when a friend emailed him in 2013, saying, “Did you know you just cost Ally Financial $80 million?”

Ally wasn't even in the healthcare space; it's an auto lender, the former finance arm of General Motors. Ally paid that sum to settle a racial discrimination case brought by the federal Consumer Financial Protection Bureau. That agency has been using the Rand formula to identify patterns of racial discrimination in consumer lending in the auto industry.

The Rand tool also has been the focus of criticism by Republicans who claim it is flawed, or "junk science." My concern over the use and reuse of Elliott's tool has little to do with the CFPB actions themselves. The issue is what else might happen to the work of today's data science teams a year or a decade down the road.

The business world, including non-profits, has so many thousands of analytics initiatives in progress that it's going to become tough to know how they will be used in the future. Algorithms can be shared in any number of ways. The reality is that some of them will be reused in inappropriate ways, perhaps harming reputations and leading to legal action against innocents -- maybe even the original authors of the algorithm.

All of us express concern about our data and how it is shared and misused. Now there is a real possibility that the tools used to analyze that data will be shared and misused as well.

Have you seen other cases like that of the Rand tool?

James M. Connolly, Editor of All Analytics

Jim Connolly is a versatile and experienced technology journalist who has reported on IT trends for more than two decades. As editor of All Analytics he writes about the move to big data analytics and data-driven decision making. Over the years he has covered enterprise computing, the PC revolution, client/server, the evolution of the Internet, the rise of web-based business, and IT management. He has covered breaking industry news and has led teams focused on product reviews and technology trends. Throughout his tech journalism career, he has concentrated on serving the information needs of IT decision-makers in large organizations and has worked with those managers to help them learn from their peers and share their experiences in implementing leading-edge technologies through publications including Computerworld. Jim also has helped to launch a technology-focused startup, as one of the founding editors at TechTarget, and has served as editor of an established news organization focused on technology startups and the Boston-area venture capital sector at MassHighTech. A former crime reporter for the Boston Herald, he majored in journalism at Northeastern University.



Re: Math is racist: How data is driving inequality
  • 9/9/2016 11:00:56 AM

The story reminds me of something circulating in our current political news, where a real estate company allegedly coded rental applications to show race, got nailed by the authorities, and eventually settled the case. The use and misuse of data is always going to need a boatload of lawyers to settle the issues.

Re: Math is racist: How data is driving inequality
  • 9/8/2016 2:26:21 PM

@Terry well said!

Re: Math is racist: How data is driving inequality
  • 9/8/2016 1:33:04 PM

There are different types of discrimination that lenders or others might practice.

Overt discrimination exists when a lender openly and blatantly discriminates on a prohibited basis.

Disparate treatment occurs when a lender treats an applicant differently based on one of the prohibited bases.

Disparate impact occurs when a policy or practice applied equally to all applicants has a disproportionate adverse impact on applicants in a protected group.

The latter is the case in which objective criteria can be used, intentionally or unintentionally, to discriminate.
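
For illustration (my own sketch, not part of the comment above), one common screening heuristic for disparate impact, borrowed from employment law, is the four-fifths rule: compare approval rates across groups and flag a concern when one group's rate falls below 80% of the highest group's. The numbers here are made up.

```python
# Illustrative four-fifths-rule screen for disparate impact.
# Approval counts are hypothetical.

def approval_rate(approved, applicants):
    return approved / applicants

rates = {
    "group_a": approval_rate(720, 1000),  # 72% approved
    "group_b": approval_rate(510, 1000),  # 51% approved
}

highest = max(rates.values())
for group, rate in rates.items():
    ratio = rate / highest
    status = "potential disparate impact" if ratio < 0.8 else "within the 4/5 threshold"
    print(f"{group}: approval rate {rate:.0%}, ratio to highest group {ratio:.2f} ({status})")
```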

Re: Math is racist: How data is driving inequality
  • 9/8/2016 1:25:48 PM

@Seth

You've sidestepped the question.

My answer is that there are objective criteria that are not racist, even if the outcome from the algorithm may accidentally favor certain demographics.

Math is not racist. Like statistics, it can be misused, but in that case it's the mathematician, not the math, that is racist.

Re: Math is racist: How data is driving inequality
  • 9/8/2016 11:57:56 AM

@ PredictableChaos - Banks are required to take HMDA information for mortgages. If the applicant doesn't want to answer, the loan officer makes their best guess. In the case of low-income minorities, banks will sometimes relax the income guidelines in order to increase the number of minority borrowers in their loan portfolio. This actually works out well, because low-income minorities have a better record of paying on their mortgages than many high-income borrowers; their homes are so precious to them that they will do anything to keep them.

Re: Math is racist: How data is driving inequality
  • 9/8/2016 11:41:47 AM

@Seth,

Yes, it's certainly possible to be racist in a covert way - classifying people by their accents or zip codes and then rejecting loans for people who the bank has reason to think are a certain race.

Let's talk about a different case - with no overt or covert thought about race, where the bank seeks to loan to all races and only uses objective criteria that will predict their future default rates.

Is it impossible to discriminate based on objective criteria (like on-time payment history of the individual requesting a new loan)? Or is that also racist, because certain races may have poorer scores on that metric?

Re: Math is racist: How data is driving inequality
  • 9/8/2016 11:21:02 AM

@ PredictableChaos - Algorithms could use zip codes to figure out what neighborhood an individual is in. Since neighborhoods are often segregated, that would give a probability of what race someone is. It could use names and other identifiers as well.

Since humans can often guess someone's race by their voice or accent, I would suspect that it will be possible for a computer to detect it also.

Re: Math is racist: How data is driving inequality
  • 9/8/2016 10:37:42 AM

Math is racist?

If a lender never considers or even knows race in deciding whom to lend to, but does consider on-time payment history, will that be racist?

For online lenders, it becomes possible to design an algorithm that is not and cannot be aware of race. Shouldn't that make it impossible to be racist?

Re: Math is racist: How data is driving inequality
  • 9/7/2016 11:44:13 AM

Yes, you could even say that certain algorithms help drive self-fulfilling prophecies, or as one learned speechwriter referred to it, the soft bigotry of low expectations.

This all really demonstrates the importance of examining our underlying assumptions when working with data -- and being willing to blow them up completely.

Math is racist: How data is driving inequality
  • 9/6/2016 7:57:36 PM

I'm reading this interesting article named "Math is racist: How data is driving inequality" by Aimee Rawlins on CNN Money, 09/06/16.

A snippet of it "Denied a job because of a personality test? Too bad -- the algorithm said you wouldn't be a good fit. Charged a higher rate for a loan? Well, people in your zip code tend to be riskier borrowers. Received a harsher prison sentence? Here's the thing: Your friends and family have criminal records too, so you're likely to be a repeat offender. (Spoiler: The people on the receiving end of these messages don't actually get an explanation.)"

It goes to show how algorithms reinforce stereotypes and keep some people in poverty.

 

 
