Strategy for Grammar Construction



Compound verb forms can be interrupted by other elements, such as adverbials and nominals. In the initial phase of grammar development, however, compound verb forms are treated as predominantly continuous entities rather than discontinuous ones. This choice reflects the paradigmatic way of thinking that dominates the early modeling stage of the verb complex. Unsupervised use of regular expressions to predict where “external” elements may occur within compound verb forms is not very effective. Handling the discontinuity of these forms requires a carefully constructed set of rules, which in turn requires a thorough examination of syntagmatic patterns.

Longest Versus Shortest Match Principle

The CLaRK system’s cascaded regular grammar engine can identify segments of varying lengths in a sequence, which helps in defining the structure of compound verb forms and annotating them. A strategy is therefore needed for choosing the most suitable length and composition of the constructs examined at intermediate parsing levels.
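
The difference between the two principles can be illustrated with a minimal sketch. It uses Python's `re` module as a stand-in for the grammar engine, so greedy and lazy quantifiers here are only an analogy for longest and shortest match, not the CLaRK implementation. The one-letter tags (`V` for a full-content verb, `s` for a small word, `N` for a nominal) and the helper `match_span` are assumptions made for the example.

```python
import re

def match_span(pattern: str, tags: str) -> str:
    """Return the first substring of `tags` matched by `pattern`, or ''."""
    m = re.search(pattern, tags)
    return m.group() if m else ""

# Hypothetical tag sequence: nominal, verb, two small words, nominal.
tags = "NVssN"

# Longest match: the greedy quantifier takes every small word after the verb.
longest = match_span(r"Vs*", tags)

# Shortest match: the lazy quantifier stops as soon as the verb is found.
shortest = match_span(r"Vs*?", tags)

print(longest)   # Vss
print(shortest)  # V
```

Under the longest match the trailing small words become part of the segment, while under the shortest match they remain available to rules at a later level of the cascade.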

Rules and Category Assignment in Initial Grammar Development

Illustrating the methodology described in Section 3, the following rules are introduced for syntagmatic pattern matching.

Recognizing Verb Clusters with Small Words

The rule identifies segments comprising a full-content verb and a specific number and type of small words. The right-hand side uses a regular expression indicating that a full-content verb cluster may consist of:

1. A finite verb form inflected for present, aorist, or imperfect tense.
2. A non-finite verb form, which can be an indefinite form of the active aorist, active imperfect, or passive participle.
3. Any of the verb forms listed in items 1 and 2, preceded or followed by various sequences of small words.

The left-hand side includes XML markup that is added to the rule’s output and encloses the recognized pattern, denoted by the variable Yw. Since the input words to the regular expression are the contents of nested XML elements, an Element Value in the form of an XPath expression is defined for the element serving as the context node of the computed tree structure.
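
The cluster pattern described above can be sketched roughly as follows. This is not the authors' rule and does not use CLaRK syntax: the one-letter categories, the `<vc>` element name, the function `mark_clusters`, and the transliterated example sentence are all assumptions chosen for illustration, and the input is a flat list of tagged tokens rather than nested XML elements addressed through XPath.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical one-letter categories (not the tagset used in the source):
# F = finite verb, P = participle, S = small word, O = anything else.
# A cluster is a verb form with optional small words on either side.
CLUSTER = re.compile(r"S*[FP]S*")

def mark_clusters(tokens):
    """Wrap each verb cluster in a hypothetical <vc> element."""
    tags = "".join(tag for _, tag in tokens)
    out, pos = [], 0
    for m in CLUSTER.finditer(tags):
        # Copy the words between the previous cluster and this one unchanged.
        out.extend(escape(word) for word, _ in tokens[pos:m.start()])
        words = " ".join(escape(word) for word, _ in tokens[m.start():m.end()])
        out.append(f"<vc>{words}</vc>")
        pos = m.end()
    # Copy whatever follows the last cluster.
    out.extend(escape(word) for word, _ in tokens[pos:])
    return " ".join(out)

# Illustrative transliterated input with hand-assigned categories.
sentence = [("Az", "O"), ("ne", "S"), ("go", "S"), ("vidyah", "F"), ("vchera", "O")]
print(mark_clusters(sentence))  # Az <vc>ne go vidyah</vc> vchera
```

Because each token maps to exactly one character in the tag string, match offsets in the string correspond directly to token positions, which keeps the wrapping step simple.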

