Human-Centric Artificial Intelligence

Why are biases so prevalent in AI?

By Daresay Team - May 20, 2019

Until recently, AI and machine learning were pretty much “left to themselves”. Systems were built and put to work without too much thought about the ethical consequences. This began to change a couple of years ago as the number of published articles and research papers discussing bias in machine learning began to rise dramatically.

“We do not see things as they are. We see things as we are.”

Rabbi Shemuel ben Nachmani, as quoted in the Talmudic tractate Berakhot (55b)

Of course, today we can look back at some of the early AI tools and see exactly why certain issues occurred but, like any fledgling technology, the only way to find out what will happen is to test it. And then test it again. What’s become apparent is that rather than countering many of today’s existing biases, AI tools are actually reinforcing them. This is no surprise; it’s humans that are behind the training data used in machine learning and as a species we are incapable of being unbiased.

Add to this the homogeneous make-up of many of the teams working with AI – whose prejudices and values are inherent in the learning process – and you can see why AI has developed certain biases. This may not be such a big deal if AI is being used to estimate the number of red blood cells in a human body, or to control specific phases of industrial production. However, when there is a negative impact on society or any individual within it, something has to change.

In 2018 Amazon finally scrapped its AI recruiting tool as it continued to show bias towards men. It had taught itself to do things such as penalise resumes that had the word “women’s” in them. The tool is now used for rudimentary tasks such as removing duplicate candidate profiles.

Read the full Reuters article: Amazon scraps secret AI recruiting tool that showed bias against women OCTOBER 10, 2018

What biases exist and how can we mitigate them in AI?

We must start by admitting that implicit biases exist. It’s no use claiming that you’re an objective person, always open to other people’s points of view. Your entire life – family, peers, schooling, work, experiences, etc. – has been instrumental in developing your biases. And that’s OK. You may not know what biases you have, but they implicitly exist and will impact the way you behave or create training data.

As humans, we also have cognitive biases: those that we all share, even if the context in which they show up may differ. In total there are over a hundred of these, including:

Confirmation bias – whereby we seek information that will confirm our beliefs and prejudices rather than oppose them. Ultimately this leads us to convert what are essentially neutral signals into subjective bias-forming signals.
Bandwagon effect – where we tend to do or believe something because others do, a tendency that the publishers of fake news have been able to exploit very effectively now that information spreads around the globe instantly.
Hindsight bias – where, once we know how something turned out, we believe that the outcome was more predictable than it actually was.

In 2016 the American Courts’ COMPAS algorithm – which was used to predict a defendant’s risk of committing another crime – was investigated. It was found to be twice as likely to inaccurately predict that a black person would reoffend and twice as likely to inaccurately predict that a white person would not reoffend.

Read the full ProPublica article: Machine Bias MAY 23, 2016
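
The kind of disparity described above can be checked by comparing error rates per group rather than overall accuracy. The following is a minimal sketch, not the method used in the COMPAS investigation: the function name, the record layout and the toy numbers are all assumptions made up for illustration.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return {group: (false_positive_rate, false_negative_rate)}.

    Each record is a tuple (group, predicted_high_risk, reoffended),
    where the last two fields are booleans. This layout is hypothetical.
    """
    counts = defaultdict(lambda: {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
    for group, predicted, actual in records:
        if predicted and not actual:
            counts[group]["fp"] += 1  # flagged, but did not reoffend
        elif not predicted and not actual:
            counts[group]["tn"] += 1
        elif not predicted and actual:
            counts[group]["fn"] += 1  # not flagged, but did reoffend
        else:
            counts[group]["tp"] += 1

    rates = {}
    for group, c in counts.items():
        did_not_reoffend = c["fp"] + c["tn"]
        did_reoffend = c["fn"] + c["tp"]
        fpr = c["fp"] / did_not_reoffend if did_not_reoffend else 0.0
        fnr = c["fn"] / did_reoffend if did_reoffend else 0.0
        rates[group] = (fpr, fnr)
    return rates


# Made-up toy data for two groups, "A" and "B" (not real COMPAS figures).
records = (
    [("A", True, False)] * 4 + [("A", False, False)] * 6
    + [("A", False, True)] * 2 + [("A", True, True)] * 8
    + [("B", True, False)] * 2 + [("B", False, False)] * 8
    + [("B", False, True)] * 4 + [("B", True, True)] * 6
)
rates = error_rates_by_group(records)
for group, (fpr, fnr) in sorted(rates.items()):
    print(f"{group}: false positive rate {fpr:.0%}, false negative rate {fnr:.0%}")
```

In this toy data both groups are scored correctly 70% of the time, yet group A is wrongly flagged twice as often as group B, and group B is wrongly cleared twice as often as group A. That is why an overall accuracy figure alone can hide this type of bias.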

So how does this apply to AI?

A successful algorithm is trained using data that accurately represents the environment or scenario in which it will operate. Creating or providing unbiased training data samples, however, is extremely difficult. And if you wish to change the status quo, historical data will not necessarily achieve this, however accurately it reflects the past. In the case of the Amazon recruiting AI tool, the algorithm was trained using data from past applications in what is a predominantly male occupation, and it therefore considered males to be better suited for many jobs. This is a classic case of prejudice bias.
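
One simple check before training is to compare how each group is represented in the sample with how you expect, or want, it to be represented where the algorithm will be used. The sketch below assumes you can label each training example with a group and that you have chosen target shares yourself; the function name, the labels and the 80/20 split are illustrative assumptions, not figures from the Amazon case.

```python
from collections import Counter


def representation_gaps(sample_groups, expected_shares):
    """Return {group: share_in_sample - expected_share}.

    A positive value means the group is over-represented in the sample,
    a negative value means it is under-represented.
    """
    if not sample_groups:
        raise ValueError("sample_groups must not be empty")
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in expected_shares.items()
    }


# Hypothetical training sample built from past applications.
training_groups = ["men"] * 80 + ["women"] * 20
# Target shares are a choice you make, not something the data tells you.
expected = {"men": 0.5, "women": 0.5}

gaps = representation_gaps(training_groups, expected)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.0%} relative to target")
```

A check like this only covers who is in the data. It says nothing about whether the labels attached to those examples, such as past hiring decisions, were themselves biased.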

Hidden feedback loops are when scenarios become self-perpetuating. For instance, algorithms used in the US to predict areas where crimes are more likely to take place have led to more police being sent to patrol these areas, and thus more crimes being reported by police in these areas – crimes which may have gone unreported otherwise.
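
A toy simulation shows how such a loop can sustain itself. This is a deliberately simplified sketch under stated assumptions: every area has exactly the same amount of actual crime, most patrols go to whichever area has the most recorded crime, and the share of crime that gets recorded equals the share of patrols present. The function name and all numbers are made up for illustration and do not describe any real predictive policing system.

```python
def simulate_patrol_feedback(initial_records, true_crimes_per_round,
                             rounds, hot_spot_share=0.8):
    """Simulate recorded crime per area when patrols follow past records.

    Needs at least two areas. Each round, the area with the most recorded
    crime gets hot_spot_share of the patrols; the rest is split evenly
    between the other areas. Recorded crime grows in proportion to patrol
    presence, although actual crime is identical everywhere.
    """
    records = list(initial_records)
    other_share = (1 - hot_spot_share) / (len(records) - 1)
    for _ in range(rounds):
        hot_spot = records.index(max(records))
        for area in range(len(records)):
            patrol_share = hot_spot_share if area == hot_spot else other_share
            records[area] += true_crimes_per_round * patrol_share
    return records


# Two areas that start almost level: 11 versus 10 recorded crimes.
records = simulate_patrol_feedback([11, 10], true_crimes_per_round=100, rounds=10)
share = records[0] / sum(records)
print(f"Share of recorded crime in area 0 after 10 rounds: {share:.0%}")
```

Area 0 starts with about 52% of the recorded crime, and its share climbs towards the 80% patrol share even though both areas have identical actual crime. The records end up reflecting where the police were sent, which is exactly the data the next round of predictions is trained on.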

What can we do?

Again, we need to be aware that this type of bias exists in AI. That’s the first hurdle. The next job is to try and mitigate it. The Ethical OS is a great place to start. It provides an ethics toolkit for future tech. For algorithms specifically, it offers a series of basic questions that you can ask of your AI project:

Does this technology make use of deep data sets and machine learning? If so, are there gaps or historical biases in the data that might bias the technology?
Have you seen instances of personal or individual bias enter into your product’s algorithms?
How could these have been prevented or mitigated?
Is the technology reinforcing or amplifying existing bias?
Who is responsible for developing the algorithm? Is there a lack of diversity in the people responsible for the design of the technology?
How will you push back against a blind preference for automation (the assumption that AI-based systems and decisions are correct and don’t need to be verified or audited)?
Are your algorithms transparent to the people impacted by them? Is there any recourse for people who feel they have been incorrectly or unfairly assessed?

Using best practices and tools to minimise bias

Bias isn’t a new phenomenon. Mathematicians have been looking at ways of mitigating it for years, so many best practices used in statistics can be applied to an AI project.

More recently, leading tech firms have been developing open source tools to help developers mitigate bias, including tools for auditing data sets and machine learning models.

AI is no different to any other side of tech; auditing and transparency are essential for trust and future development. This doesn’t necessarily mean access to all data has to be shared, but it’s only through transparency that a fair and independent assessment of training data can be made. This, for instance, may have enabled the COMPAS algorithm to be “redirected in the right direction” at an early stage.

As with many things in tech, what starts with people pushing boundaries can quickly find its way into the mainstream or at least have an impact on the mainstream. For instance, just as biohackers are changing healthcare, there is a movement towards transparency and open source in AI.

Collaborating with AI

Ultimately, we, as humans, will have to learn to collaborate with AI as we move into a joint cognitive system. AI will help us make better decisions on a daily basis, but in order for this to be successful we need to mitigate bias. And to do that we need to understand, accept and reflect on our biases, create more diverse work groups and audit training data. Do this right, and we can have truly effective augmented intelligence where man and machine work together.

This article is based on a Trend Talk by Daresayers Daniel Rönnlund and Daniel Sjöström, “Human-Centric Artificial Intelligence”.