
Liang, Sheng; Zhao, Mengjie and Schütze, Hinrich (May 2022): Modular and Parameter-Efficient Multimodal Fusion with Prompting. ACL 2022, Dublin, Ireland, May 2022. In: Muresan, Smaranda; N, Preslav and Villavicencio, Aline (eds.): Findings of the Association for Computational Linguistics: ACL 2022, Stroudsburg, PA: Association for Computational Linguistics. pp. 2976-2985 [PDF, 858kB]


Abstract

Recent research has made impressive progress in large-scale multimodal pre-training. In the context of the rapid growth of model size, it is necessary to seek efficient and flexible methods other than finetuning. In this paper, we propose to use prompt vectors to align the modalities. Our method achieves comparable performance to several other multimodal fusion methods in low-resource settings. We further show that our method is modular and parameter-efficient for processing tasks involving two or more data modalities.
