Modular and Parameter-Efficient Multimodal Fusion with Prompting

Liang, Sheng; Zhao, Mengjie and Schütze, Hinrich (May 2022): Modular and Parameter-Efficient Multimodal Fusion with Prompting. ACL 2022, Dublin, Ireland, May 2022. In: Muresan, Smaranda; N, Preslav and Villavicencio, Aline (eds.): Findings of the Association for Computational Linguistics: ACL 2022. Stroudsburg, PA: Association for Computational Linguistics, pp. 2976-2985 [PDF, 858kB]

DOI: 10.5282/ubm/epub.92202

Abstract

Recent research has made impressive progress in large-scale multimodal pre-training. In the context of the rapid growth of model size, it is necessary to seek efficient and flexible methods other than finetuning. In this paper, we propose to use prompt vectors to align the modalities. Our method achieves comparable performance to several other multimodal fusion methods in low-resource settings. We further show that our method is modular and parameter-efficient for processing tasks involving two or more data modalities.
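
As a rough illustration of what aligning modalities with prompt vectors can look like, the sketch below prepends a small set of trainable vectors to image features and text embeddings that would come from frozen pre-trained encoders. This record does not describe the architecture, so everything here is an assumption made for illustration: the function names (make_prompt, fuse, count_params), the ordering [prompt; image; text], the Gaussian initialisation, and the toy sizes are hypothetical and are not taken from the paper.

```python
import random


def make_prompt(num_vectors, dim, seed=0):
    """Create a small matrix of prompt vectors (hypothetical Gaussian init).

    In this sketch these are the only values that would be trained.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(num_vectors)]


def fuse(prompt, image_features, text_embeddings):
    """Build one input sequence: [prompt; image features; text embeddings].

    The ordering is an assumption; the encoders that would produce the
    image and text vectors are treated as frozen and are not modelled here.
    """
    dim = len(prompt[0])
    for row in image_features + text_embeddings:
        if len(row) != dim:
            raise ValueError("all vectors must share the model dimension")
    return prompt + image_features + text_embeddings


def count_params(matrix):
    """Number of scalar values in a list-of-lists matrix."""
    return sum(len(row) for row in matrix)


# Toy sizes, chosen only for illustration.
DIM = 8
prompt = make_prompt(num_vectors=4, dim=DIM)
image_features = [[0.0] * DIM for _ in range(3)]   # stand-in for frozen image encoder output
text_embeddings = [[1.0] * DIM for _ in range(5)]  # stand-in for frozen text embeddings

sequence = fuse(prompt, image_features, text_embeddings)
trainable = count_params(prompt)
print(len(sequence), trainable)
```

Under these assumptions the number of trainable values depends only on the prompt length and the model dimension, not on the size of the frozen encoders, which is the sense in which such a setup is parameter-efficient. Supporting a further modality would mean adding another block of feature vectors to the sequence while the existing components stay unchanged.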

Document type:	Conference contribution (Paper)
EU Funded Grant Agreement Number:	740516
EU projects:	Horizon 2020 > ERC Grants > ERC Advanced Grant > ERC Grant 740516: NonSequeToR - Non-sequence models for tokenization replacement
Cross-faculty institutions:	Centrum für Informations- und Sprachverarbeitung (CIS)
Subjects:	000 Computer science, information and general works > 000 Computer science, knowledge, systems; 400 Language > 400 Language; 400 Language > 410 Linguistics
URN:	urn:nbn:de:bvb:19-epub-92202-4
Place:	Stroudsburg, PA
Language:	English
Document ID:	92202
Date published on Open Access LMU:	27 May 2022 10:08
Last modified:	27 May 2022 10:08
