Our world is on its way to self-driving cars, robot home assistants, and even fully automated entertainment. Although AI is still in its infancy, it can already carry out simple but useful daily tasks: smart home assistants turn devices on and off by voice command, robot vacuums clean our homes automatically, and AI programs monitor and analyze data in finance, healthcare, and leisure. However, AI is only as good as the data it learns from, and that is a real concern because biased data can lead to discrimination and other social harms. For example, early speech recognition algorithms struggled to process the voices of African-American speakers and people with ethnic accents, and genetic ancestry services have lacked sufficient data on people of non-American and non-European descent. Programmers must work with researchers to understand the algorithmic bias in their data so that AI systems can perform fairly and create equal opportunity.
Let’s explore a few cases of algorithmic bias:
Amazon Online Recruitment Tool
Engineers at Amazon built a recruitment algorithm that learned to recognize word patterns in applications rather than relevant skill sets. The model was trained on ten years of the company's applications, which came predominantly from male engineers, and used those patterns to score how well new applicants would fit. Over time the software began to penalize words associated with women, such as degrees from women's colleges. In 2015, Amazon realized that its recruitment tool had developed a gender bias and has since scrapped it.
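To see the mechanism at a high level, here is a minimal sketch, not Amazon's actual system, showing how a text classifier trained on historically skewed hiring decisions can end up assigning negative weight to words correlated with the underrepresented group. The tiny "résumés" and hiring labels are invented for illustration only.

```python
# Minimal sketch: a classifier trained on skewed historical hiring data
# can learn to penalize words correlated with the underrepresented group.
# The tiny "resumes" and labels below are entirely invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

resumes = [
    "captain of men's chess club, java developer",
    "men's soccer team, backend engineer, java and python",
    "women's college graduate, data analysis, python developer",
    "women's chess club president, java and sql experience",
]
hired = [1, 1, 0, 0]  # historical outcomes reflect past (biased) decisions

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(resumes)
model = LogisticRegression().fit(X, hired)

# Inspect which words the model learned to reward or penalize
for word, weight in zip(vectorizer.get_feature_names_out(), model.coef_[0]):
    print(f"{word:10s} {weight:+.2f}")
# A term like "women" ends up with a negative weight purely because the
# historical labels were skewed, not because it reflects skill.
```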
Correctional Offender Management Profiling for Alternative Sanctions (COMPAS)
US court systems used this algorithm to predict how likely a defendant was to reoffend and to help set the length of detention while the defendant awaited trial. COMPAS produced risk scores for defendants by analyzing arrest records, demographics, and other variables. Its false positive rate for African-American defendants (45%) was roughly twice that for white defendants (23%). This painted a false picture, since non-African-American defendants were in fact equally likely to reoffend.
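The disparity here concerns false positive rates: among defendants who did not reoffend, how often the tool still labeled them high risk. A minimal sketch with made-up numbers (not the actual COMPAS data) shows how such a per-group audit can be computed.

```python
# Sketch of a per-group false positive rate audit with invented numbers
# (illustrative only, not the real COMPAS dataset).
records = [
    # (group, predicted_high_risk, actually_reoffended)
    ("A", True,  False), ("A", True,  False), ("A", False, False),
    ("A", True,  True),  ("A", False, True),
    ("B", True,  False), ("B", False, False), ("B", False, False),
    ("B", True,  True),  ("B", False, True),
]

def false_positive_rate(rows):
    """Share of people who did NOT reoffend but were still flagged high risk."""
    did_not_reoffend = [r for r in rows if not r[2]]
    flagged = [r for r in did_not_reoffend if r[1]]
    return len(flagged) / len(did_not_reoffend)

for group in ("A", "B"):
    rows = [r for r in records if r[0] == group]
    print(group, f"FPR = {false_positive_rate(rows):.0%}")
# Equal overall reoffense rates can coexist with very different false
# positive rates across groups, which is the disparity reported for COMPAS.
```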
Facial Recognition Technology
According to MIT researcher Joy Buolamwini, the training data sets behind commercial facial recognition systems have been estimated to be 75% male and 80% Caucasian. While the programs were 99% accurate at recognizing Caucasian males, their error rate for darker-skinned females was 20-34%. IBM and Microsoft both responded to Buolamwini's findings with commitments to improve recognition accuracy for darker-skinned faces.
Word Association Bias
By analyzing over 2.2 million words with commercial machine learning programs, Princeton University researchers discovered that European names carried more positive associations than African-American names. They also found that the words 'woman' and 'girl' were more strongly associated with the arts, while male terms were associated with science and math. The researchers showed that machine learning algorithms pick up and reinforce the racial and gender biases expressed by humans.
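The underlying technique compares how close word vectors sit to sets of attribute words (for example, arts versus science terms). Here is a minimal sketch with toy three-dimensional vectors; real studies like the Princeton work use embeddings trained on billions of words, and the numbers below are invented.

```python
# Sketch of a word-embedding association test with toy 3-d vectors.
# Real studies use embeddings (e.g. GloVe) trained on web-scale text;
# the vectors below are invented for illustration.
import numpy as np

embeddings = {
    "woman":   np.array([0.9, 0.1, 0.2]),
    "man":     np.array([0.1, 0.9, 0.2]),
    "art":     np.array([0.8, 0.2, 0.1]),
    "poetry":  np.array([0.7, 0.3, 0.2]),
    "science": np.array([0.2, 0.8, 0.1]),
    "math":    np.array([0.1, 0.7, 0.3]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def association(word, attributes):
    """Mean cosine similarity between a word and a set of attribute words."""
    return sum(cosine(embeddings[word], embeddings[a]) for a in attributes) / len(attributes)

arts, stem = ["art", "poetry"], ["science", "math"]
for w in ("woman", "man"):
    print(w, "arts:", round(association(w, arts), 2),
             "stem:", round(association(w, stem), 2))
# If "woman" scores closer to the arts set and "man" to the STEM set,
# the embedding has absorbed the gendered associations in its training text.
```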
How can we improve this situation? Does the data need to be governed, and if so, by whom and how? Should the algorithms themselves be held accountable? How can programmers improve AI? What do you think can be done to reduce algorithmic bias?