Algorithmic bias: From discrimination discovery to fairness-aware data mining

Sara Hajian, Francesco Bonchi, Carlos Castillo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

61 Citations (Scopus)

Abstract

Algorithms and decision making based on Big Data have become pervasive in all aspects of our daily lives (offline and online), as they have become essential tools in personal finance, health care, hiring, housing, education, and policing. It is therefore of societal and ethical importance to ask whether these algorithms can be discriminatory on grounds such as gender, ethnicity, or health status. It turns out that the answer is positive: for instance, recent studies in the context of online advertising show that ads for high-income jobs are presented to men much more often than to women [5], and that ads for arrest records are significantly more likely to show up in searches for distinctively black names [16]. This algorithmic bias exists even when the developer of the algorithm has no intention to discriminate. Sometimes it may be inherent to the data sources used (software making decisions based on data can reflect, or even amplify, the results of historical discrimination), but even when the sensitive attributes have been suppressed from the input, a well-trained machine learning algorithm may still discriminate on the basis of such sensitive attributes because of correlations existing in the data. These considerations call for the development of data mining systems that are discrimination-conscious by design. This is a novel and challenging research area for the data mining community. The aim of this tutorial is to survey algorithmic bias, presenting its most common variants, with an emphasis on the algorithmic techniques and key ideas developed to derive efficient solutions. The tutorial covers two main complementary approaches: algorithms for discrimination discovery, and discrimination prevention by means of fairness-aware data mining. We conclude by summarizing promising paths for future research.
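The abstract's claim that suppressing sensitive attributes does not guarantee fairness can be illustrated with a minimal, hypothetical sketch (not taken from the tutorial itself; all data and feature names below are invented): a model trained only on a proxy feature correlated with the sensitive attribute still reproduces the historical disparity.

```python
# Illustrative sketch: even with the sensitive attribute s removed from the
# input, a model can discriminate through a correlated proxy feature z.
import random

random.seed(0)

# Synthetic population: sensitive attribute s, a proxy z (think "neighborhood")
# correlated with s, and historically biased outcome labels y.
data = []
for _ in range(10_000):
    s = random.random() < 0.5
    z = random.random() < (0.8 if s else 0.2)   # z correlates with s
    y = random.random() < (0.2 if s else 0.6)   # outcomes biased against s=1
    data.append((s, z, y))

# "Train" the simplest possible model on z alone (s is suppressed):
# predict the majority historical outcome observed for each value of z.
def majority_rate(zv):
    ys = [y for s, z, y in data if z == zv]
    return sum(ys) / len(ys)

predict = {zv: majority_rate(zv) >= 0.5 for zv in (False, True)}

# Positive-prediction rate per sensitive group: despite never seeing s,
# the model recovers the historical disparity via the proxy z.
rates = {}
for group in (False, True):
    rows = [row for row in data if row[0] == group]
    rates[group] = sum(predict[z] for s, z, y in rows) / len(rows)
    print(f"group s={int(group)}: positive-prediction rate = {rates[group]:.2f}")
```

The two groups receive positive predictions at very different rates even though the sensitive attribute was never an input, which is exactly the correlation effect the abstract describes.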

Original language: English
Title of host publication: KDD 2016 - Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 2125-2126
Number of pages: 2
Volume: 13-17-August-2016
ISBN (Electronic): 9781450342322
DOIs: 10.1145/2939672.2945386
Publication status: Published - 13 Aug 2016
Externally published: Yes
Event: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016 - San Francisco, United States
Duration: 13 Aug 2016 - 17 Aug 2016

Other

Other: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016
Country: United States
City: San Francisco
Period: 13/8/16 - 17/8/16

Fingerprint

  • Data mining
  • Decision making
  • Finance
  • Health care
  • Learning algorithms
  • Learning systems
  • Marketing
  • Education
  • Health

Keywords

  • Algorithmic bias
  • Discrimination discovery
  • Discrimination prevention

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Hajian, S., Bonchi, F., & Castillo, C. (2016). Algorithmic bias: From discrimination discovery to fairness-aware data mining. In KDD 2016 - Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Vol. 13-17-August-2016, pp. 2125-2126). Association for Computing Machinery. https://doi.org/10.1145/2939672.2945386

