Building a grammar starts with an abstraction from grammar books, which are written for human readers rather than for direct use in software. Grammar books and paper dictionaries do not provide the comprehensive data structures that real-life software applications need. Despite this limitation, the grammar is constructed by deductive inference from grammar books and the grammar writer’s own language competence. The result is a first attempt at the parsing problem, here focused on Bulgarian verb forms. The following principles guide this initial phase of grammar construction:
Exhaustiveness
The grammar writer considers all constructs that express tense, mood, and voice: simplex forms as well as the various combinations of finite and non-finite auxiliary and main verb forms. Positive forms are handled first, followed by the forms of predicate negation and interrogation, introduced by the negative particle “ne” and the interrogative particle “li” respectively. Complexity increases further when short pronominals are included in compound forms.
Definition of Key Morphosyntactic Features and Combinations
The morphosyntactic tags attached to verb tokens in the text carry values for multiple features that differentiate verb paradigms. When writing regular expressions, the grammar writer must specify exactly those feature values that are relevant for accurate pattern matching. For instance, lexical features such as Aspect and Transitivity are irrelevant to these patterns and are represented by wildcards in the input words of the regular expressions. At the same time, the indefinite forms of past active and passive participles must be mentioned explicitly, because they can be components of tense, mood, or voice forms. Definite forms are excluded, because they belong to noun or adjective phrases: Bulgarian expresses definiteness morphologically, with the definite article attached as a word-form ending. A detailed description of the morphosyntactic specifications for verbs in the BulTreeBank tagset is given in Slavcheva (2003).
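The use of wildcards for irrelevant features and explicit values for relevant ones can be illustrated with a minimal Python sketch. The five-position tag layout below is a hypothetical simplification invented for this illustration and is not the actual BulTreeBank tagset; the function names are likewise assumptions.

```python
import re

# Hypothetical positional tag layout, for illustration only
# (NOT the actual BulTreeBank tagset):
#   0: category      V = verb
#   1: aspect        p = perfective, i = imperfective   -> wildcard
#   2: transitivity  t = transitive, i = intransitive   -> wildcard
#   3: form          f = finite, c = participle
#   4: definiteness  i = indefinite, d = definite, - = not applicable
INDEFINITE_PARTICIPLE = re.compile(r"V..ci")


def is_indefinite_participle(tag: str) -> bool:
    """Return True if the whole tag matches the indefinite-participle pattern."""
    return INDEFINITE_PARTICIPLE.fullmatch(tag) is not None


def candidate_positions(tags: list[str]) -> list[int]:
    """Indices of tokens whose tags qualify as indefinite participles."""
    return [i for i, tag in enumerate(tags) if is_indefinite_participle(tag)]
```

Under these assumptions, the tags `Vptci` and `Viici` both match because aspect and transitivity are left unspecified, while `Vptcd` is rejected because the definiteness value is stated explicitly in the pattern.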