LatMor: A Latin Finite-State Morphology Encoding Vowel Quantity

Springmann, Uwe; Schmid, Helmut and Najock, Dietmar (2016): LatMor: A Latin Finite-State Morphology Encoding Vowel Quantity. In: Open Linguistics, Vol. 2, No. 1: pp. 386-392

DOI: 10.1515/opli-2016-0019

Abstract

We present LatMor, the first large-coverage, open-source finite-state morphology for Latin that both parses and generates vowel quantity information. LatMor is based on the Berlin Latin Lexicon, which comprises about 70,000 lemmata of classical Latin. The lexicon was compiled by the group of Dietmar Najock in their work on concordances of Latin authors (see Rapsch and Najock, 1991) and was recently updated by us. The well-known Morpheus system of Crane (1991, 1998) is written in the C programming language, is based on the 50,000 lemmata of Lewis and Short (1907), and is not well documented and therefore not easily extended. Compared to Morpheus, our new morphology has a larger vocabulary, is about 60 to 1200 times faster, and is built in the form of finite-state transducers, which can analyze as well as generate wordforms and represent the state-of-the-art implementation method in computational morphology. The current coverage of LatMor is evaluated against Morpheus and other existing systems (some of which are not openly accessible); LatMor is shown to rank first among all systems, together with the Pisa LEMLAT morphology (not yet openly accessible). Recall has been analyzed with the Latin Dependency Treebank as gold data, and the remaining defect classes have been identified. LatMor is available under an open source licence to allow wide usage by all interested parties.

Document type:	Journal article
Faculty:	Languages and Literatures
Subjects:	400 Language > 400 Language
URN:	urn:nbn:de:bvb:19-epub-47162-5
ISSN:	2300-9969
Language:	English
Document ID:	47162
Date published on Open Access LMU:	27 Apr 2018 08:12
Last modified:	04 Nov 2020 13:24
