Are algorithms and big data prejudiced?
A great article worth reading with highlights below:
Are Algorithms Building the New Infrastructure of Racism? - Issue 55: Trust - Nautilus
"We don't know what our customers look like," said Craig Berman, vice president of global communications at Amazon, to…
Big-data practitioners understand that large, richly detailed datasets of the sort that Amazon and other corporations use to deliver custom-targeted services inevitably contain fingerprints of protected attributes like skin color, gender, and sexual and political orientation. The decisions that algorithms make on the basis of this data can, invisibly, turn on these attributes, in ways that are as inscrutable as they are unethical.
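To make the "fingerprints" point concrete, here is a minimal sketch in Python (all data synthetic, feature names invented): a model is never shown the protected attribute, yet that attribute is almost perfectly recoverable from innocuous-looking features, so anything trained on those features can implicitly condition on it.

```python
# Sketch: a protected attribute leaking through correlated "neutral" features.
# Synthetic data; the features and numbers are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Protected attribute, which is never handed to any downstream model.
group = rng.integers(0, 2, size=n)

# "Neutral" features that correlate with the protected attribute, the way
# residential segregation makes ZIP code a strong racial proxy.
zip_code = group + rng.normal(0, 0.3, size=n)
income = 50 - 10 * group + rng.normal(0, 8, size=n)
X = np.column_stack([zip_code, income])

# The protected attribute is almost perfectly recoverable from the proxies.
proxy_model = LogisticRegression().fit(X, group)
print(f"group recoverable from 'neutral' features: "
      f"{proxy_model.score(X, group):.0%} accuracy")
```

Dropping the protected column, in other words, removes nothing; the information survives in the correlations.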
. . .
a fundamental question in algorithmic fairness is the degree to which algorithms can be made to understand the social and historical context of the data they use. “You can tell a human operator to try to take into account the way in which data is itself a representation of human history,” Crawford says. “How do you train a machine to do that?” Machines that can’t understand context in this way will, at best, merely pass on institutionalized discrimination — what’s called “bias in, bias out.”
. . .
Because persons that police classified as Afro-Americans were re-arrested more often in the training dataset, they claimed, COMPAS is justified in predicting that other persons classified as Afro-Americans by police — even in a different city, state, and time period — are more likely to be re-arrested. The cycling of classification into data then back into classification echoes W.E.B. Du Bois's 1923 definition, "the black man is the man who has to ride Jim Crow in Georgia."
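That cycle is easy to simulate. Here is a hedged sketch (synthetic data, not Northpointe's actual model): two groups reoffend at exactly the same rate, but one is policed twice as intensively, so its re-arrest labels come out higher, and a "risk" model trained on those labels dutifully reports it as riskier.

```python
# Sketch of "bias in, bias out": re-arrest labels encode policing intensity,
# not behavior. Synthetic data; all rates are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 20000
group = rng.integers(0, 2, size=n)            # the classification itself

reoffend = rng.random(n) < 0.30               # SAME true rate in both groups
detection = np.where(group == 1, 0.60, 0.30)  # group 1 policed twice as hard
rearrested = reoffend & (rng.random(n) < detection)

# Train a "risk score" on the re-arrest labels.
model = LogisticRegression().fit(group.reshape(-1, 1), rearrested)
risk = model.predict_proba(np.array([[0], [1]]))[:, 1]
print(f"predicted re-arrest risk: group 0 = {risk[0]:.2f}, "
      f"group 1 = {risk[1]:.2f} (true reoffense rates equal by construction)")
```

The model is "accurate" on its own labels, and the labels are the bias.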
. . .
if prejudices are reflected, and thus transmitted, in the statistics of language itself, then the way we speak doesn’t just communicate the way we view each other, it constructs it. If de-biasing projects like Bolukbasi’s can work, we can begin to shift our biases at scale and in a way that was previously impossible: with software. If they don’t, we face the danger of reinforcing and perpetuating those biases through a digital infrastructure that may last for generations.
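Bolukbasi's method is, at heart, linear algebra. A toy version of the projection step (invented 4-dimensional vectors standing in for real ~300-dimensional word2vec embeddings):

```python
# Sketch of the linear-projection idea behind Bolukbasi et al.'s debiasing.
# Toy 4-d vectors with invented values; real embeddings are ~300-d.
import numpy as np

emb = {
    "he":         np.array([ 1.0, 0.2, 0.1, 0.0]),
    "she":        np.array([-1.0, 0.2, 0.1, 0.0]),
    "programmer": np.array([ 0.4, 0.9, 0.3, 0.1]),  # leans "he"-ward
}

# 1. Estimate a gender direction from a definitional pair.
g = emb["he"] - emb["she"]
g /= np.linalg.norm(g)

# 2. Neutralize a word that should be gender-neutral by subtracting
#    its projection onto that direction.
v = emb["programmer"]
v_debiased = v - (v @ g) * g

print("bias before:", round(float(v @ g), 3))
print("bias after: ", round(float(v_debiased @ g), 3))  # ~0 after projection
```

Note what the sketch assumes: that the bias lives along a single straight line in the embedding space. That assumption is exactly what the next excerpt questions.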
. . .
He points out that Bolukbasi’s paper assumes that gender is binary, or at least that the connection between gendered words follows a straight line. “I don’t think [we] have any clue of how [debiasing] can work for a concept that is perhaps marginally more complex,” he cautions. He points in particular to racial stereotypes, where the very notion of categories is as problematic as the means used to define them.
. . .
A Google Image search for “CEO,” for instance, produces images that are overwhelmingly of white men. Narayanan says that these problems may be overlooked in discussions of fairness because “they are harder to formulate mathematically — in computer science, if you can’t study something in formal terms, its existence is not as legitimate as something you can turn into an equation or an algorithm.”
. . .
what gets chosen is usually whatever is easiest to quantify, rather than what is fairest.
. . .
Consider how much dated technology already permeates our lives — air traffic control systems, for instance, run largely on software built in the 1970s. The recent “WannaCry” worm that crippled hospital systems across the United Kingdom exploited the fact that these systems ran on a decades-old version of Windows, which Microsoft wasn’t even bothering to maintain. A machine understanding of language, embedded in core services, could carry forward present prejudices for years or decades hence. In the words of artist Nicole Aptekar, “infrastructure beats intent.”
The greatest danger of the new digital infrastructure is not that it will decay, or be vulnerable to attack, but that its worst features will persist. It’s very hard to tear down a bridge once it’s up.
Jaron Lanier, one of the founders and pioneers of virtual reality, echoes the article above when interviewed on the Tavis Smiley show:
Virtual Reality Pioneer Jaron Lanier, Part 1 | Interviews | Tavis Smiley | PBS
Okay, so the first thing to say is that technology and especially information technology is a very human endeavor. We like to pretend that we’re like in lab coats and that we’re doing this thing that’s very objective, but it isn’t. Like when we make algorithms, it reflects our assumptions and our culture.
And to the degree that we can’t diversify our own teams, we’re actually limiting our stuff and making our stuff worse. There’s so many examples of that. Like I saw some research that indicated that virtual reality worked better for men than for women and the researchers claimed this is intrinsic.
And I said, well, look at the teams who made the particular tests you’re using. Oh, guess what? There’s no diversity on the team. Try it with stuff from diverse teams. Oh, all of a sudden, it works. So this shouldn’t be a surprise, right?
So there’s a way that an initial bias or initial exclusion compounds itself over and over again. So it’s absolutely critical not to let it get started.
. . .
There’s another level of this which is pretty dark. I’ll do my best to explain this really quickly.
The way the algorithms work on social media, and in general with what we call advertising, the behavior modification loop business plan, is that you have to give people stimulus, whether it's a social media feed or whatever, that keeps them engaged.
This is engagement, right? So what keeps you engaged? Unfortunately, negative emotions, fear, anxiety, anger, these things are more engaging, more immediately and more persistently engaging, okay?
So I will ask you a question. Why is it that there have been so many phenomena where people use social media and it seems to be creating positive social change and then, just like a year later, there's this backlash that's worse than anybody could have imagined?
I can mention a few examples. The Arab Spring was the first prominent example, but I also want to mention Black Lives Matter. So there are sort of two levels to what's going on.
On the surface level, which is what people see, these things are incredibly beautiful. Like I personally found Black Lives Matter to be incredibly moving and I think like Black Twitter is like major literature, like it’s literature for the ages. Really, it’s astonishingly beautiful.
But the thing is, behind the scenes, there’s a completely different game going on that has nothing to do with any of that. What’s happening is that all the content, the energy, the fuel that’s coming in from movements of this kind has to be processed in such a way as to generate engagement and profits for the machine.
So there’s no like evil genius doing this. It’s just algorithmic, so it gets processed to be turned into negative emotions for somebody because that’s the most efficient way to use the fuel.
So then what you have is this primed thing where it’s somehow packaged in order to irritate as many people as possible and, because the negative emotions are more powerful, the backlash that arises which would probably not have been stimulated otherwise is even greater than the initial thing.
So the reaction online from something like Black Lives Matter will always be more intense than Black Lives Matter…
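The engagement loop Lanier describes can be shown in a few lines. A minimal sketch, under his stated assumption that anger engages more than anything else (all numbers invented):

```python
# Sketch of Lanier's point: a ranker that optimizes only for engagement
# drifts toward anger-inducing content. All numbers are invented.
import numpy as np

rng = np.random.default_rng(3)
n_items = 1000
anger = rng.random(n_items)   # how much anger each item provokes, 0..1

# Assumption (Lanier's): negative emotion is "more immediately and more
# persistently engaging", so angrier items earn more engagement.
engagement = 0.2 + 0.6 * anger + rng.normal(0, 0.05, n_items)

# The feed simply shows whatever a pure engagement ranker scores highest.
feed = np.argsort(engagement)[-50:]
print(f"mean anger, all items:   {anger.mean():.2f}")
print(f"mean anger, top-50 feed: {anger[feed].mean():.2f}")
```

No evil genius anywhere in the code, exactly as he says; the skew falls out of the objective function.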
Another example can be found in my article on "Digital Hegemony," in which Google search results return foreign websites rather than local ones. "This gives rise to a form of digital hegemony, whereby producers in a few countries get to define what is read by others."