Automatic correction of part-of-speech corpora

Reichel, Uwe D. and Bucar Shigemori, Lia Saki (2008): Automatic correction of part-of-speech corpora. In: Speech and language technology, Vol. 11, pp. 167-174 [PDF, 55kB]

DOI: 10.5282/ubm/epub.13565

Abstract

In this study a simple method for automatic correction of part-of-speech corpora is presented, which works as follows: Initially, two or more already available part-of-speech taggers are applied to the data. Then a sample of differing outputs is taken to train a classifier to predict, for each difference, which of the taggers (if any) delivered the correct output. As classifiers we employed instance-based learning, a C4.5 decision tree, and a Bayesian classifier. Their performances ranged from 59.1 % to 67.3 %. Training on the automatically corrected data finally led to significant improvements in tagger performance.
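
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature set, the overlap similarity, the fallback to tagger A when neither tagger is judged correct, the toy tags, and all names (`disagreements`, `features`, `NearestNeighbour`, `correct`) are assumptions made for illustration. Only the instance-based learner is sketched; the C4.5 tree and the Bayesian classifier are omitted.

```python
from collections import Counter


def disagreements(tags_a, tags_b):
    """Return the token indices where the two taggers assign different tags."""
    return [i for i, (a, b) in enumerate(zip(tags_a, tags_b)) if a != b]


def features(tokens, tags_a, tags_b, i):
    """Hypothetical feature vector for one disagreement (not from the paper)."""
    word = tokens[i]
    return (
        tags_a[i],           # tag proposed by tagger A
        tags_b[i],           # tag proposed by tagger B
        word[-2:].lower(),   # two-character word suffix
        word[:1].isupper(),  # capitalisation of the first letter
    )


def overlap(x, y):
    """Count the feature positions on which two vectors agree."""
    return sum(1 for u, v in zip(x, y) if u == v)


class NearestNeighbour:
    """Minimal instance-based learner: k-NN with overlap similarity."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []

    def fit(self, X, y):
        # Instance-based learning simply stores the labelled examples.
        self.memory = list(zip(X, y))
        return self

    def predict(self, x):
        ranked = sorted(self.memory, key=lambda m: overlap(x, m[0]), reverse=True)
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


def correct(tokens, tags_a, tags_b, clf):
    """Resolve each disagreement using the label 'A', 'B', or 'none'."""
    out = list(tags_a)
    for i in disagreements(tags_a, tags_b):
        choice = clf.predict(features(tokens, tags_a, tags_b, i))
        if choice == "B":
            out[i] = tags_b[i]
        # 'A' or 'none': keep tagger A's tag (fallback assumed for this sketch)
    return out


# Toy hand-labelled sample of disagreements: which tagger was right?
train_X = [
    ("NN", "VB", "lk", False),
    ("JJ", "RB", "ly", False),
    ("NE", "NN", "in", True),
]
train_y = ["A", "B", "A"]
clf = NearestNeighbour(k=1).fit(train_X, train_y)

tokens = ["She", "spoke", "quickly"]
tags_a = ["PRP", "VBD", "JJ"]  # output of hypothetical tagger A
tags_b = ["PRP", "VBD", "RB"]  # output of hypothetical tagger B
corrected = correct(tokens, tags_a, tags_b, clf)
print(corrected)  # ['PRP', 'VBD', 'RB']
```

In a setting like the one described, the corrected tags would then serve as training data for retraining a tagger; that step is left out here.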

Document type:	Journal article
Faculty:	Languages and Literatures > Department 2 > Phonetics and Speech Processing
Subjects:	400 Language > 400 Language
URN:	urn:nbn:de:bvb:19-epub-13565-7
Language:	English
Document ID:	13565
Date published on Open Access LMU:	13 Jul 2012, 08:10
Last modified:	04 Nov 2020, 12:54
