Improving Isochronous Machine Translation with Target Factors and Auxiliary Counters

Pal, Proyag; Thompson, Brian; Virkar, Yogesh; Mathur, Prashant; Chronopoulou, Alexandra ORCID: https://orcid.org/0000-0002-7379-7677 and Federico, Marcello (2023): Improving Isochronous Machine Translation with Target Factors and Auxiliary Counters. Interspeech Conference, Dublin, Ireland, 20-24 August 2023. International Speech Communication Association (Ed.), In: 24th Annual Conference of the International Speech Communication Association (INTERSPEECH 2023), Vol. 1. New York: Curran Associates. pp. 37-41

Full text not available on 'Open Access LMU'.

DOI: 10.21437/Interspeech.2023-1063

Abstract

To translate speech for automatic dubbing, machine translation needs to be isochronous, i.e. translated speech needs to be aligned with the source in terms of speech durations. We introduce target factors in a transformer model to predict durations jointly with target language phoneme sequences. We also introduce auxiliary counters to help the decoder to keep track of the timing information while generating target phonemes. We show that our model improves translation quality and isochrony compared to previous work where the translation model is instead trained to predict interleaved sequences of phonemes and durations.
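The contrast drawn in the abstract, between one interleaved stream of phonemes and durations and a factored output with helper counters, can be illustrated with a small sketch. This record contains only the abstract, so everything concrete here is an assumption made for illustration: the function names, the use of integer frame counts as durations, and the single "remaining frames" counter are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def interleave(phonemes: Sequence[str], durations: Sequence[int]) -> List[str]:
    """Single output stream: phoneme, duration, phoneme, duration, ..."""
    if len(phonemes) != len(durations):
        raise ValueError("need exactly one duration per phoneme")
    stream: List[str] = []
    for phoneme, duration in zip(phonemes, durations):
        stream.extend([phoneme, str(duration)])
    return stream


def factorize(
    phonemes: Sequence[str], durations: Sequence[int]
) -> Tuple[List[str], List[int]]:
    """Two parallel streams of equal length: the main phoneme stream and a
    duration factor predicted jointly at the same position."""
    if len(phonemes) != len(durations):
        raise ValueError("need exactly one duration per phoneme")
    return list(phonemes), list(durations)


def remaining_frames_counter(durations: Sequence[int], total_frames: int) -> List[int]:
    """Hypothetical auxiliary counter: the number of frames still available
    in the time budget just before each phoneme is generated."""
    remaining = total_frames
    counter: List[int] = []
    for duration in durations:
        counter.append(remaining)
        remaining -= duration
    return counter


# Toy example (made-up phonemes and durations, in frames).
phonemes = ["HH", "AH", "L", "OW"]
durations = [3, 5, 4, 8]

interleaved = interleave(phonemes, durations)
main_stream, duration_factor = factorize(phonemes, durations)
counter = remaining_frames_counter(durations, total_frames=sum(durations))
```

In this toy setup the interleaved stream is twice as long as the phoneme sequence, while the factored form keeps one position per phoneme and carries the duration alongside it. The counter shows the kind of running timing signal a decoder could be given at each step.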

Document type:	Conference paper
Cross-faculty institutions:	Centrum für Informations- und Sprachverarbeitung (CIS)
Subjects:	000 Computer science, information science, general works > 004 Computer science; 400 Language > 410 Linguistics
ISBN:	978-1-7138-8880-2
ISSN:	2958-1796
Place of publication:	New York
Language:	English
Document ID:	123789
Date published on Open Access LMU:	25 Feb 2025 15:38
Last modified:	25 Feb 2025 15:38
