Abstract
We describe LMU Munich's machine translation system for German→Czech translation, which we submitted to the WMT19 shared task on unsupervised news translation. We train our model using only monolingual data from both languages. The final model is an unsupervised neural model that uses established techniques for unsupervised translation, such as denoising autoencoding and online back-translation. We bootstrap the model with masked language model pretraining and enhance it with back-translations from an unsupervised phrase-based system, which is itself bootstrapped using unsupervised bilingual word embeddings.
Document type: | Journal article |
---|---|
Cross-faculty institutions: | Centrum für Informations- und Sprachverarbeitung (CIS) |
Subject areas: | 400 Language > 400 Language |
Language: | English |
Document ID: | 84245 |
Date of publication on Open Access LMU: | 15 Dec 2021, 15:10 |
Last modified: | 15 Dec 2021, 15:10 |