Sources of Linguistic Knowledge and Grammar Writing Facilities

For the task of constructing a grammar for parsing compound verb forms, the BulTreeBank project provides a special-purpose corpus of one million word tokens. The texts come from newspapers and are stored as XML documents with TEI-conformant markup down to the paragraph level. They are processed by a morphological analyzer and then manually disambiguated with the help of the constraint system in ClaRK (Simov et al. 2002a). The electronic lexicon used for morphosyntactic analysis (Popov et al. 1998) contains entries for single words only, so the information it holds about verb tense, mood, and voice is limited to what single verb forms can express.
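As a rough illustration of what paragraph-level markup makes possible, the sketch below pulls paragraph text out of a small XML fragment and counts whitespace-separated tokens. The sample document, the un-namespaced `<p>` element, and the helper names `paragraphs` and `token_count` are assumptions made for this example; the actual BulTreeBank files and their tokenization are not shown in the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a TEI-like fragment marked up only down to paragraphs.
SAMPLE = """<text><body>
<p>First paragraph of a news article.</p>
<p>Second paragraph, with <hi>inline</hi> markup.</p>
</body></text>"""


def paragraphs(xml_string):
    """Return the plain text of every <p> element, whitespace-normalized."""
    root = ET.fromstring(xml_string)
    result = []
    for p in root.iter("p"):
        # itertext() also collects text nested inside inline elements.
        text = "".join(p.itertext())
        result.append(" ".join(text.split()))
    return result


def token_count(paragraph):
    """Naive token count: split on whitespace (an assumption, not the
    project's own tokenization)."""
    return len(paragraph.split())


if __name__ == "__main__":
    for para in paragraphs(SAMPLE):
        print(token_count(para), para)
```

Documents that declare an XML namespace would need the namespace-qualified tag name in place of the bare `"p"`.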

The encoded information covers three verb tenses (present, aorist, and imperfect), the imperative mood, and a few special conditional forms of the auxiliary verb “sam” (‘to be’).
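The feature inventory just listed can be pictured as a small data structure. The following is only a sketch under stated assumptions: the class and field names are hypothetical, the word forms are transliterated illustrations, and the real lexicon format (Popov et al. 1998) is not described in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tense(Enum):
    PRESENT = "present"
    AORIST = "aorist"
    IMPERFECT = "imperfect"


class Mood(Enum):
    IMPERATIVE = "imperative"
    # Per the text, conditional forms are recorded only for the auxiliary.
    CONDITIONAL = "conditional"


AUXILIARY_LEMMA = "sam"  # 'to be'


@dataclass(frozen=True)
class VerbForm:
    """Hypothetical single-word lexicon entry for a verb form."""
    wordform: str
    lemma: str
    tense: Optional[Tense] = None
    mood: Optional[Mood] = None

    def __post_init__(self):
        # Reject conditional entries for anything but the auxiliary,
        # mirroring the restriction described above.
        if self.mood is Mood.CONDITIONAL and self.lemma != AUXILIARY_LEMMA:
            raise ValueError("conditional is encoded only for the auxiliary")


if __name__ == "__main__":
    print(VerbForm("cheta", "cheta", tense=Tense.PRESENT))
    print(VerbForm("bih", "sam", mood=Mood.CONDITIONAL))
```

Because each entry describes a single word, nothing in this structure can represent a compound form as a unit; that is the gap a grammar for compound verb forms has to fill.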