Abstract
The Levenshtein distance is an established metric for representing phonological distances between dialects. So far, this metric has usually been applied to manually transcribed word lists. In this study we introduce several extensions of the Levenshtein distance that incorporate probabilistic edit costs as well as temporal alignment costs. We tested all variants for compliance with the axiom that within-dialect utterance pairs are phonologically more similar than across-dialect ones. In contrast to earlier studies, we do not apply the metrics to preselected, prototypical word lists but to real connected speech data that was automatically segmented and labeled. The transcription edit distances already performed well in reflecting the difference between within- and across-dialect comparisons, and adding a temporal component tended to weaken the performance of the metrics.
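To make the idea of non-uniform edit costs concrete, the following is a minimal sketch of a Levenshtein distance with a pluggable substitution cost. It is not the implementation used in the study: the function names `weighted_levenshtein` and `unit_cost`, the fixed insertion/deletion cost, and the example cost values are all assumptions chosen for illustration. The temporal alignment costs mentioned above are not modeled here.

```python
from typing import Callable, Sequence


def weighted_levenshtein(
    a: Sequence[str],
    b: Sequence[str],
    sub_cost: Callable[[str, str], float],
    indel_cost: float = 1.0,
) -> float:
    """Edit distance between two label sequences with a custom substitution cost.

    sub_cost(x, y) is a hypothetical hook: it could, for example, be derived
    from segment confusion probabilities. indel_cost is a fixed penalty for
    inserting or deleting one segment (an assumption of this sketch).
    """
    m, n = len(a), len(b)
    # d[i][j] holds the cheapest cost of turning a[:i] into b[:j].
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + indel_cost
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + indel_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + indel_cost,                       # deletion
                d[i][j - 1] + indel_cost,                       # insertion
                d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitution or match
            )
    return d[m][n]


def unit_cost(x: str, y: str) -> float:
    """Classic Levenshtein substitution cost: 0 for a match, 1 otherwise."""
    return 0.0 if x == y else 1.0
```

With `unit_cost` the function reduces to the standard Levenshtein distance, so `weighted_levenshtein("kitten", "sitting", unit_cost)` gives 3.0. Passing a cost function that returns smaller values for phonetically close segments lowers the distance between transcriptions that differ only in such segments.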
Document type: | Conference contribution (Paper) |
---|---|
Publication form: | Postprint |
Keywords: | dialect, distance metrics, Levenshtein, temporal distance |
Faculty: | Languages and Literatures > Department 2 > Phonetics and Speech Processing |
Subject areas: | 400 Language > 410 Linguistics |
URN: | urn:nbn:de:bvb:19-epub-18050-3 |
Language: | German |
Document ID: | 18050 |
Date published on Open Access LMU: | 27 Jan 2014, 12:49 |
Last modified: | 04 Nov 2020, 12:59 |