Abstract
We propose bidirectional imparting, or BiImp, a generalized method for aligning embedding dimensions with concepts during the embedding learning phase. While preserving the semantic structure of the embedding space, BiImp makes dimensions interpretable, which has a critical role in deciphering the black-box behavior of word embeddings. BiImp separately utilizes both directions of a vector space dimension: each direction can be assigned to a different concept. This increases the number of concepts that can be represented in the embedding space. Our experimental results demonstrate the interpretability of BiImp embeddings without making compromises on the semantic task performance. We also use BiImp to reduce gender bias in word embeddings by encoding gender-opposite concepts (e.g., male-female) in a single embedding dimension. These results highlight the potential of BiImp in reducing biases and stereotypes present in word embeddings. Furthermore, task- or domain-specific interpretable word embeddings can be obtained by adjusting the corresponding word groups in embedding dimensions according to task or domain. As a result, BiImp offers wide liberty in studying word embeddings without any further effort.
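The idea that each direction of one dimension can carry its own concept can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the hinge-style penalty, the function name `bidirectional_penalty`, the `margin` parameter, and the toy word groups are hypothetical and are not taken from the paper, whose actual training objective is not stated in this abstract.

```python
def bidirectional_penalty(emb, pos_groups, neg_groups, margin=1.0):
    """Illustrative hinge-style penalty (an assumption, not the paper's loss).

    emb        -- list of embedding vectors, one list of floats per word
    pos_groups -- dict: dimension index -> word indices whose concept is
                  tied to the positive direction of that dimension
    neg_groups -- dict: dimension index -> word indices whose concept is
                  tied to the negative direction of that dimension
    """
    penalty = 0.0
    for d, words in pos_groups.items():
        # Words of the "positive" concept are pushed to at least +margin.
        penalty += sum(max(0.0, margin - emb[w][d]) for w in words)
    for d, words in neg_groups.items():
        # Words of the "negative" concept are pushed to at most -margin.
        penalty += sum(max(0.0, margin + emb[w][d]) for w in words)
    return penalty


# Toy data (hypothetical): dimension 0 holds a gender-opposite concept pair.
emb = [
    [2.0, 0.3],   # word 0, assigned to the positive direction
    [0.5, 0.1],   # word 1, assigned to the positive direction, too weak
    [-1.5, 0.2],  # word 2, assigned to the negative direction
    [0.4, 0.0],   # word 3, assigned to the negative direction, wrong side
]
pos_groups = {0: [0, 1]}
neg_groups = {0: [2, 3]}

loss = bidirectional_penalty(emb, pos_groups, neg_groups)
```

In this sketch only words 1 and 3 contribute to `loss`, because they are the ones not yet placed beyond the margin on their assigned side of dimension 0. Adding such a term to an embedding objective is one possible way to tie two opposite concepts to a single dimension while leaving the remaining dimensions unconstrained.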
Document type: | Journal article |
---|---|
Faculty: | Languages and Literatures > Department 2 |
Subject areas: | 400 Language > 400 Language |
ISSN: | 0306-4573 |
Language: | English |
Document ID: | 110528 |
Date published on Open Access LMU: | 02 Apr. 2024, 07:18 |
Last modified: | 02 Apr. 2024, 07:18 |