FAIR-DB: FunctionAl dependencIes to discoveR Data Bias

Azzalini F.; Criscuolo C.; Tanca L.
2021-01-01

Abstract

Computers and algorithms have become essential tools that pervade all aspects of our daily lives; this technology is based on data and, for it to be reliable, we have to make sure that the data on which it is based are fair and without bias. In this context, Fairness has become a relevant topic of discussion within the field of Data Science Ethics and, more generally, in Data Science. Today's applications should therefore be accompanied by tools to discover bias in data, in order to avoid (possibly unintentional) unethical behavior and its consequences; as a result, technologies that accurately discover discrimination and bias in databases are of paramount importance. In this work we propose FAIR-DB (FunctionAl dependencIes to discoveR Data Bias), a novel solution to detect bias and discover discrimination in datasets that exploits the notion of Functional Dependency, a particular type of constraint on the data. The proposed solution is implemented as a framework that focuses on mining such dependencies and proposes new metrics for evaluating the bias found in the input dataset. Our tool can identify the attributes of the database that encompass discrimination (e.g., gender, ethnicity or religion) and those that instead satisfy various fairness measures; moreover, thanks to specific aspects of these metrics and the intrinsic nature of dependencies, the framework provides very precise information about the groups that are treated unequally, yielding more insight into the bias present in the dataset than other existing tools. Finally, our system also suggests possible next steps, indicating the most appropriate (already existing) algorithms to correct the dataset on the basis of the computed results.
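
To make the idea concrete, here is a minimal, hypothetical Python sketch (not the paper's actual algorithm or metric definitions): it checks how strongly a candidate dependency from a protected attribute to an outcome holds within each group compared to the whole table, and flags groups whose rate deviates most from the overall value. The column names, toy data and the simple per-group difference score are illustrative assumptions.

import pandas as pd

# Toy table with a protected attribute and an outcome (hypothetical data).
df = pd.DataFrame({
    "ethnicity": ["A", "A", "A", "B", "B", "B", "B", "A"],
    "income":    ["low", "low", "high", "high", "high", "high", "low", "low"],
})

target = "low"  # outcome value whose prevalence is compared across groups

# Overall rate at which the outcome takes the target value.
overall = (df["income"] == target).mean()

# Per-group confidence of the candidate rule "ethnicity = g -> income = low".
per_group = (
    df.assign(hit=df["income"].eq(target))
      .groupby("ethnicity")["hit"]
      .mean()
)

# Simple difference score: groups far from the overall rate are candidates
# for further inspection as potentially treated unequally.
print((per_group - overall).sort_values(ascending=False))

This only conveys the general intuition; the framework described in the paper mines the dependencies themselves and defines its own evaluation metrics.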
2021
CEUR Workshop Proceedings
File in this record:
PIE+Q_4.pdf — Adobe PDF, 1.08 MB, open access

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1260514
Citations
  • Scopus: 2