BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief

Kassner, Nora; Tafjord, Oyvind; Schütze, Hinrich and Clark, Peter (November 2021): BeliefBank: Adding Memory to a Pre-Trained Language Model for a Systematic Notion of Belief. 2021 Conference on Empirical Methods in Natural Language Processing, Online and Punta Cana, Dominican Republic, November 7-11, 2021. In: Moens, Marie-Francine; Huang, Xuanjing; Specia, Lucia and Yih, Scott Wen-tau (eds.): Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA: Association for Computational Linguistics. pp. 8849-8861 [PDF, 466kB]

DOI: 10.5282/ubm/epub.92197

Abstract

Although pretrained language models (PTLMs) contain significant amounts of world knowledge, they can still produce inconsistent answers to questions when probed, even after specialized training. As a result, it can be hard to identify what the model actually “believes” about the world, making it susceptible to inconsistent behavior and simple errors. Our goal is to reduce these problems. Our approach is to embed a PTLM in a broader system that also includes an evolving, symbolic memory of beliefs – a BeliefBank – that records but then may modify the raw PTLM answers. We describe two mechanisms to improve belief consistency in the overall system. First, a reasoning component – a weighted MaxSAT solver – revises beliefs that significantly clash with others. Second, a feedback component issues future queries to the PTLM using known beliefs as context. We show that, in a controlled experimental setting, these two mechanisms result in more consistent beliefs in the overall system, improving both the accuracy and consistency of its answers over time. This is significant as it is a first step towards PTLM-based architectures with a systematic notion of belief, enabling them to construct a more coherent picture of the world, and improve over time without model retraining.
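
As a rough illustration of the two mechanisms named in the abstract, the following sketch revises a tiny set of beliefs by brute-force weighted MaxSAT and then builds a follow-up query that carries known beliefs as context. It is not the authors' implementation: the statements, weights, constraint penalties, prompt format, and the function names `revise` and `build_query` are all hypothetical, and an actual system would use a dedicated MaxSAT solver rather than enumerating every assignment.

```python
from itertools import product

# Hypothetical raw model answers: statement -> (answer, confidence weight).
raw_beliefs = {
    "swallow is a bird": (True, 0.9),
    "swallow is an animal": (False, 0.6),
    "swallow has wings": (True, 0.8),
}

# Hypothetical soft constraints: (premise, conclusion, penalty weight),
# read as "premise implies conclusion".
constraints = [
    ("swallow is a bird", "swallow is an animal", 1.0),
    ("swallow is a bird", "swallow has wings", 1.0),
]


def revise(raw_beliefs, constraints):
    """Brute-force weighted MaxSAT over all truth assignments.

    The score of an assignment is the total weight of raw answers it agrees
    with plus the total weight of implications it satisfies.
    """
    names = list(raw_beliefs)
    best_score, best = float("-inf"), None
    for values in product([True, False], repeat=len(names)):
        assignment = dict(zip(names, values))
        score = sum(w for n, (ans, w) in raw_beliefs.items() if assignment[n] == ans)
        score += sum(p for a, b, p in constraints if (not assignment[a]) or assignment[b])
        if score > best_score:
            best_score, best = score, assignment
    return best


def build_query(question, beliefs, relevant):
    """Prefix a question with selected stored beliefs (assumed prompt format)."""
    context = " ".join(
        f"{s}: {'yes' if beliefs[s] else 'no'}." for s in relevant
    )
    return f"{context} {question}"


revised = revise(raw_beliefs, constraints)
# The clashing answer "swallow is an animal" is flipped to True.
print(revised)

query = build_query("Does a swallow have feathers?", revised, ["swallow is a bird"])
print(query)
```

In this toy setting, keeping the raw answer "swallow is an animal: no" would violate a constraint worth more than the confidence in that answer, so the revision flips it. How relevant beliefs are chosen for the query context is left open here; the sketch simply passes them in by hand.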

Item type:	Conference contribution (Paper)
EU Funded Grant Agreement Number:	740516
EU projects:	Horizon 2020 > ERC Grants > ERC Advanced Grant > ERC Grant 740516: NonSequeToR - Non-sequence models for tokenization replacement
Cross-faculty institutions:	Centrum für Informations- und Sprachverarbeitung (CIS)
Subjects:	000 Computer science, information and general works > 000 Computer science, knowledge, systems; 400 Language > 400 Language; 400 Language > 410 Linguistics
URN:	urn:nbn:de:bvb:19-epub-92197-2
Place of publication:	Stroudsburg, PA
Language:	English
Document ID:	92197
Date deposited on Open Access LMU:	27 May 2022 09:37
Last modified:	27 May 2022 09:47