Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP

Schick, Timo; Udupa, Sahana ORCID: https://orcid.org/0000-0003-3647-9570 and Schütze, Hinrich (17 December 2021): Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP. In: Transactions of the Association for Computational Linguistics, Vol. 9: pp. 1408-1424 [PDF, 411kB]

Creative Commons: Attribution 4.0 (CC-BY)

DOI: 10.5282/ubm/epub.92231

External full text: https://transacl.org/index.php/tacl/article/view/3071

Abstract

When trained on large, unfiltered crawls from the Internet, language models pick up and reproduce all kinds of undesirable biases that can be found in the data: They often generate racist, sexist, violent, or otherwise toxic language. As large models require millions of training examples to achieve good performance, it is difficult to completely prevent them from being exposed to such content. In this paper, we first demonstrate a surprising finding: Pretrained language models recognize, to a considerable degree, their undesirable biases and the toxicity of the content they produce. We refer to this capability as self-diagnosis. Based on this finding, we then propose a decoding algorithm that, given only a textual description of the undesired behavior, reduces the probability of a language model producing problematic text. We refer to this approach as self-debiasing. Self-debiasing does not rely on manually curated word lists, nor does it require any training data or changes to the model’s parameters. While we by no means eliminate the issue of language models generating biased text, we believe our approach to be an important step in this direction.
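The abstract does not state the decoding rule itself. As a rough illustration of the idea, the sketch below compares the next-token distribution for the original input with the distribution obtained when the input is combined with a textual description of the undesired behavior, and damps the tokens that the description makes more likely. The function name `self_debias`, the `decay` constant, the exponential damping rule, and the toy probabilities are all assumptions made for this example and are not taken from the abstract.

```python
import math


def self_debias(p_plain, p_prompted, decay=50.0):
    """Rescale next-token probabilities (illustrative sketch only).

    p_plain:    dict token -> probability given the original input
    p_prompted: dict token -> probability given the input combined with a
                textual description of the undesired behavior
    decay:      hypothetical strength constant (an assumption)
    """
    scaled = {}
    for token, p in p_plain.items():
        delta = p - p_prompted.get(token, 0.0)
        # delta < 0 means the description made this token more likely,
        # so it is treated as undesired and damped exponentially.
        factor = 1.0 if delta >= 0 else math.exp(decay * delta)
        scaled[token] = factor * p
    total = sum(scaled.values())
    # Renormalize so the result is a probability distribution again.
    return {token: value / total for token, value in scaled.items()}


# Toy distributions standing in for real language model outputs.
p_plain = {"insult": 0.2, "friend": 0.5, "person": 0.3}
p_prompted = {"insult": 0.6, "friend": 0.2, "person": 0.2}

debiased = self_debias(p_plain, p_prompted)
for token, prob in sorted(debiased.items(), key=lambda kv: -kv[1]):
    print(f"{token}: {prob:.4f}")
```

In this toy case only "insult" becomes more likely under the description, so it is the only token that is damped; the remaining probability mass is redistributed over the other tokens. No word list, training data, or parameter update is involved, which matches the properties the abstract claims for self-debiasing.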

Document type:	Journal article
EU Funded Grant Agreement Number:	740516
EU projects:	Horizon 2020 > ERC Grants > ERC Advanced Grant > ERC Grant 740516: NonSequeToR - Non-sequence models for tokenization replacement
Publication form:	Publisher's Version
Faculty:	Cultural Studies > Department of Cultural Studies and Ancient Studies > Ethnology
Cross-faculty institutions:	Center for Information and Language Processing (CIS)
Subjects:	000 Computer science, information and general works > 000 Computer science, knowledge and systems; 400 Language > 400 Language; 400 Language > 410 Linguistics
URN:	urn:nbn:de:bvb:19-epub-92231-5
Language:	English
Document ID:	92231
Date of publication on Open Access LMU:	01 Jun 2022 05:20
Last modified:	01 Jun 2022 05:20
