As the A-level results scandal shows, an artificial intelligence can easily pick up and echo biases inherent in the data it was trained on.
The problem isn’t new. Plenty of the world’s police forces already rely on AIs with serious racial biases. So can we create fairer AIs, and can the many dodgy AIs already out there be fixed before the injustices really start to stack up? Here’s some science.
Does less bias really mean less intelligence?
There’s a big problem with the data that we train AIs on. It’s full of historical prejudice, which matters when AIs are already widely used by the criminal justice system, by recruiters and, most recently, to decide exam results.
Rogue AIs help keep prejudices about gender, race, social status and criminality alive. On the other hand, conventional wisdom says that when you make an AI less biased, you also make it less intelligent. Luckily, it looks like a new way of testing artificial intelligences might make algorithms fairer as well as more effective.
How to make AI less biased
There are already ways to make artificial intelligences less biased, for example by pre-processing the data used to train them to take out bias at source. But the results tend to be less accurate… or do they? Apparently not. It looks like the trade-off between accuracy and fairness is ‘a kind of illusion’, according to Sanghamitra Dutta from Carnegie Mellon University, USA.
The problem goes like this. When you use current fair training techniques to improve an AI, then test it against the original biased data, it appears less accurate, because it’s being marked against biased answers. It makes a lot more sense to test the AI using an ideal dataset, where the false trade-off between accuracy and fairness disappears.
Sanghamitra Dutta has found a way to create ideal datasets using information theory, a branch of mathematics. Soon we’ll be able to balance the amount of information the AI has on every group and get a ‘statistical guarantee’ of fairness.
Here’s an example
Imagine you work for a company that hires mostly men. You might add a group of fictional female records to get an even balance of information across both genders, giving you a fair dataset to train your AI on.
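To make the idea concrete, here’s a minimal Python sketch of that kind of balancing. It isn’t Dutta’s information-theoretic method. It simply tops up the smaller group by copying its own records at random, and the function name `balance_by_group`, the field names and the hiring history are all made up for illustration.

```python
import random
from collections import Counter


def balance_by_group(records, group_key, seed=0):
    """Top up every group to the size of the largest one.

    Assumption: the "fictional" records are random copies of real
    records from the same group, flagged as synthetic. A real system
    would generate them far more carefully.
    """
    rng = random.Random(seed)
    groups = {}
    for rec in records:
        groups.setdefault(rec[group_key], []).append(rec)

    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        for _ in range(target - len(members)):
            balanced.append({**rng.choice(members), "synthetic": True})
    return balanced


# Hypothetical hiring history: 8 men, 2 women.
history = [{"gender": "M", "hired": i % 2 == 0} for i in range(8)] + [
    {"gender": "F", "hired": True},
    {"gender": "F", "hired": False},
]

balanced = balance_by_group(history, "gender")
counts = Counter(rec["gender"] for rec in balanced)
print(counts)
```

After balancing, both genders have the same number of records, so a model trained on the result sees as much information about women as about men.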
The technique can also be used to evaluate how fair, or unfair, an existing AI is. When a fair algorithm performs much better on an ideal dataset than on the original data, it’s clear there’s a bias in the base data used to train it.
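Here’s a toy Python sketch of that comparison, under some loud assumptions: the ‘fair’ model, the skill scores and the biased hiring rule are all invented for illustration, and the ideal labels are simply written by hand rather than built with information theory.

```python
def accuracy(predict, rows, label_key):
    """Share of rows where the model's prediction matches the given label."""
    hits = sum(predict(row) == row[label_key] for row in rows)
    return hits / len(rows)


def fair_model(row):
    # Hypothetical fair rule: decide on skill alone and ignore gender.
    return row["skill"] >= 5


rows = []
for gender in ("M", "F"):
    for skill in range(10):
        rows.append({
            "gender": gender,
            "skill": skill,
            # Ideal label: the same skill threshold for everyone.
            "ideal_label": skill >= 5,
            # Invented historical bias: women needed a skill of 8 to be hired.
            "biased_label": skill >= (5 if gender == "M" else 8),
        })

acc_biased = accuracy(fair_model, rows, "biased_label")
acc_ideal = accuracy(fair_model, rows, "ideal_label")
gap = acc_ideal - acc_biased
print(acc_biased, acc_ideal)
```

The fair model looks worse when it’s marked against the biased labels, purely because it refuses to copy the historical prejudice. The size of the gap is a rough signal of how much bias is baked into the original data.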