Authors:
Wenlong Sun (1); Olfa Nasraoui (1) and Patrick Shafto (2)
Affiliations:
(1) Dept of Computer Engineering and Computer Science, University of Louisville, Louisville, KY, U.S.A.
(2) Dept of Mathematics and Computer Science, Rutgers University - Newark, Newark, NJ, U.S.A.
Keyword(s):
Information Retrieval, Machine Learning, Bias, Iterative Learning.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Computational Intelligence; Evolutionary Computing; Information Extraction; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Soft Computing; Symbolic Systems; User Profiling and Recommender Systems
Abstract:
Early supervised machine learning (ML) algorithms relied on reliable labels from experts to build predictions. More recently, these algorithms have increasingly received data from the general population in the form of labels, annotations, etc. As a result, algorithms are subject to bias born from ingesting unchecked information, such as biased samples and biased labels. Furthermore, people and algorithms are increasingly engaged in interactive processes in which neither the human nor the algorithm receives unbiased data. Algorithms can also make biased predictions, a phenomenon known as algorithmic bias. We investigate three forms of iterated algorithmic bias and how they affect the performance of machine learning algorithms. Using controlled experiments on synthetic data, we found that all three iterated bias modes affect the models learned by ML algorithms. We also found that iterated filter bias, which is prominent in personalized user interfaces, can limit humans' ability to discover relevant data.
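The mechanism behind iterated filter bias can be illustrated with a toy simulation. The sketch below is not taken from the paper; the nearest-centroid ranker, the two-cluster relevance rule, and all numeric settings are illustrative assumptions. It contrasts a learner that only receives labels for the items it currently ranks highest (filter bias) with one that receives labels for randomly chosen items, and counts how many relevant items in an initially unlabeled cluster each learner ever discovers.

```python
import random

random.seed(42)

# Synthetic catalogue: 1-D feature in [0, 1]; items are relevant
# in TWO separate regions (an assumed toy relevance rule).
items = [random.random() for _ in range(2000)]

def relevant(x):
    return x < 0.2 or x > 0.8

def run(filter_biased, rounds=30, batch=10):
    # Seed labels cover only the right-hand relevant region,
    # mimicking early user feedback concentrated on one taste cluster.
    labeled = [(x, True) for x in items if x > 0.9][:5]
    seen = {x for x, _ in labeled}
    for _ in range(rounds):
        pos = [x for x, y in labeled if y]
        centroid = sum(pos) / len(pos)  # nearest-centroid ranker
        pool = [x for x in items if x not in seen]
        if filter_biased:
            # Iterated filter bias: the user only sees (and labels)
            # the items the current model ranks highest.
            shown = sorted(pool, key=lambda x: abs(x - centroid))[:batch]
        else:
            shown = random.sample(pool, batch)  # unbiased exploration
        for x in shown:
            labeled.append((x, relevant(x)))
            seen.add(x)
    # Discovery: relevant items from the LEFT cluster ever shown to the user.
    return sum(1 for x in seen if x < 0.2)

biased = run(filter_biased=True)
unbiased = run(filter_biased=False)
print(biased, unbiased)  # filter bias discovers far fewer left-cluster items
```

Under filter bias the ranker keeps recommending items near its positive centroid, so labels never arrive from the second relevant region and the model's blind spot persists across iterations, which is one way the abstract's claim about limited discovery can play out.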