BAStat : New Statistical Resources at the Bavarian Archive for Speech Signals

Schiel, Florian (2010): BAStat : New Statistical Resources at the Bavarian Archive for Speech Signals. 7th International Conference on Language Resources and Evaluation (LREC), Valletta, Malta, 19-21 May 2010. In: Calzolari, Nicoletta (ed.): Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Paris: pp. 1069-1076

DOI: 10.5282/ubm/epub.13681

Abstract

The Bavarian Archive for Speech Signals (BAS) has released a new type of language resource, 'BAStat'. In contrast to primary resources such as speech and text corpora, BAStat comprises statistical estimates derived from a number of primary resources: first- and second-order occurrence probabilities of phones, syllables and words, duration statistics, probabilities of pronunciation variants of words, and probabilities of context information. Unlike other statistical speech resources, BAStat is based solely on recordings of conversational German and therefore models spoken language. It consists of 7-bit ASCII tables and matrices to maximize interoperability between platforms and can be downloaded from the BAS website. This paper describes in detail the empirical basis and the data types contained, offers some interesting interpretations, and briefly compares BAStat with the text-based statistical resource CELEX.
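As a rough illustration of what first- and second-order occurrence probabilities can look like, the sketch below computes relative-frequency estimates from a toy phone sequence. It reads "second order" as a conditional bigram probability P(b | a), which is an assumption about the paper's terminology; the function names and the sample sequence are hypothetical and not taken from BAStat, and the actual BAStat table format is not reproduced here.

```python
from collections import Counter


def first_order(tokens):
    """Relative frequency of each unit: P(u) = count(u) / N."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {unit: n / total for unit, n in counts.items()}


def second_order(tokens):
    """Conditional bigram estimate: P(b | a) = count(a, b) / count(a as left context)."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    left_counts = Counter(tokens[:-1])  # the last token never acts as a left context
    return {(a, b): n / left_counts[a] for (a, b), n in pair_counts.items()}


# Toy phone sequence, invented for illustration only (not BAStat data).
phones = ["d", "a", "s", "d", "a", "n"]
p1 = first_order(phones)
p2 = second_order(phones)
```

The same two functions apply unchanged to syllable or word sequences, since they only assume a list of hashable units. Unsmoothed relative frequencies like these assign zero probability to unseen units, so a real resource may apply additional processing.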

Document type:	Conference contribution (Paper)
Faculty:	Languages and Literatures > Department 2 > Phonetics and Speech Processing
Subjects:	400 Language > 400 Language
URN:	urn:nbn:de:bvb:19-epub-13681-1
Place of publication:	Paris
Language:	English
Document ID:	13681
Date published on Open Access LMU:	19 Jul 2012 09:25
Last modified:	04 Nov 2020 12:54
