# In this file I will develop the entire Loglan grammar on top of the phonetic proposal

# Dated updates now to appear here:

# nuu is an atomic A core and there is no nu-affix to A connectives and their kin

# 1/20/2018 redefined CA cores to include a possible NU prefix. This allows more logically connected tenses, for example.

# 1/13/2018 reorganized the internals of class PA in a way which should allow more things and not forbid anything legal now.
# this is pursuant to an analysis of the classes NI and PA as phrases, rather than words, as I start writing a global lexicography
# proposal document. Enforced explicit pauses after PA phrases appearing as arguments with a following modifier with an argument.

# 12/30/2017 fixed a problem with name markers in the class NameWord and made a slight change to the new option in NI (names
# as dimensions).

# 12/27/2017 installing an alternative treatment of acronyms under which they are simply names (suffix -n to acronyms in all uses).
# supporting this requires no change at all to acronymic name usage (just use the -n versions with the usual rules for names),
# and for dimension usage requires to be a name marker and support for PreName as an alternative suffix to NI.

# 12/27/2017 Frivolously fooling with the capitalization conventions. They ought to work better now...but I could have broken something.
# the main new idea was to require that a capitalized embedded letteral actually be followed by lowercase if it was preceded by lowercase
# (with the obvious exception for a letteral followed by a letteral). Also changed the rules for diphthongs in cmapua to make all-caps
# legal for cmapua.
# The general idea is that one can start with a capital letter and stay capitalized until one hits a lower case letter,
# at which point one can jump back up to caps only at a juncture (after which you can remain capitalized) or temporarily for a vowel
# after z- (after which lower case resumes) or an embedded literal (after which lowercase resumes). The total effect is that this allows
# attested capitalization patterns in Loglan (including capitalization of embedded literals as in possessive articles and acronyms)
# and also allows all-caps for individual words (attested in Leith but suppressed in my version) and supports capitalization of components
# of names as in (by artful use of syllable breaks: Leith just has BeibiDjein, which does not work for me).

# 12/26/2017 Installed (quotation of phonetically legal but so far non-Loglan words). I did not make a name marker, so if one were to
# use it with names (where it isn't really appropriate), one would have to pause initially: .
# I note in this connection that quotation of names with li...lu remains limited, since names by themselves are not
# utterances: one needs the . I fixed this as an exception in the previous parser; I may do it here or I may
# not, haven't decided. Single name words can be quoted with , of course, but not serial names.

# 12/24/2017 Refined treatment of vowel pairs for Cvv-V cmapua units. First 12/24 version rather disastrously
# broken: this should be fixed!

# 12/23/2017 This is now completely commented, with minor local exceptions to which I will return later.
# This document is the basis on which I will build all subsequent parsers, with due modifications to the comments.
# The Python PEG engine and preamble files contain commands for constructing a Python parser from it directly.

# 12/22/2017 major progress on commenting the grammar

# yet later 12/20: no change in performance of the grammar, extensive commenting in the
# grammar section.
# Considerable changes in arrangement: for example, vocatives, inverse vocatives,
# and free modifiers are moved to a much earlier point. I'm hoping to get a genuinely almost readable
# commented grammar...

# later 12/20 starting the process of commenting and editing the grammar, starting
# at basic sentence structures. Notably rewrote the class [keksent] more compactly,
# one hopes with no actual effect on parses.

# 12/20/2017 Do not require expression of pause after finally stressed cmapua before
# vowel initial predicate as a comma, since the initial vowel signals the pause anyway.
# Allow final stress in names. Fixed bug in CVVHiddenStress. Prevented
# broken monosyllables in finally stressed CVV djifoa. refinement of caprule

# 12/19/2017 seem to have had a versioning failure and lost the fix which requires
# CVVy djifoa to be followed by complete complexes. Restored.

# 12/18/2017 fixed a bug in treatment of stressed syllables in recognizing predicate starts. Also
# narrowed the generalized VCCV rule to allow more of the quite unlikely space of predicates with lots
# of vowels before the CC pair. Probably they should be banned (and none have ever been proposed with
# more than three) but that rule is not the context in which to arbitrarily ban half of them. Some cleanup
# of the display of parses, for which updated version of logicpreamble.py should also be uploaded. A refinement
# to class "connective" checking that apparent logical connectives are not initial segments of predicates.
# This has the effect of delaying the declaration of "connective" until after the declaration of
# "predstart".

# 12/17/2017 further refinement of the 12/16 version: a couple of bugs spotted.
# 12/16/2017 There should be no change in parsing behavior, but the predstart ruleset is shorter
# and more intelligible, and I realized that Complex doesn't need a check for the anti-slinkui test
# (the requirement that certain initial CVC cmapua be y-hyphenated which replaces the slinkui test)
# at all: the way predstart works already ensures that initial CV cmapua fall off in the excluded
# cases, the idea being that we test the front of a predicate without lookahead in all cases. Also
# addressed the subtle point that one wasn't forced to pause after a predicate before following y
# (not likely to arise as a problem).

# 12/14/2017 Corrected vowel grouping to avoid paradoxical vowel triples which are default
# grouped in a way which becomes illegal if made explicit. SyllableA really should contain a final
# consonant: the previous form was messing up vowel grouping. Serious bug where end of djifoa
# and syllable resolution of a predicate may fail to agree. I think I blocked this by ensuring that
# final djifoa are not followed by vowels. Other fine tuning of the complex algorithm. Also had
# to repair the check for CVCCCV and CVCCVV predicates.

# 12/13/2017: added kie ( utterance ) kiu to class LiQuote. Did fine tuning to ensure
# that cmapua streams stop before
# <li> or , that names can stop at double quotes or close
# parentheses, and that the capitalization rule ignores opening parentheses as well as double
# quotes. One can now adorn li lu with quotes (on the inside) in a reasonable way
# and adorn kie kiu with parentheses (on the inside) in a reasonable way. One cannot
# *replace* these words (or any words) with punctuation in my model of Loglan. Also,
# updates to comments, and # (end of utterance) added as a marker of terminal punctuation.

# END of dated updates

# This is now done, in a first pass. That is, the grammar is adapted and appears to work, more or less.
# What is needed is comments on the lexicography and the grammar...Phonetics has now pretty clearly been sorted
# from the grammar (there are some places where the phonetics accept grammar information with regard to punctuation).
# Alien text is now handled somewhat differently. Some issues to do with quoting names are not finalized and have not been tested.

# I added -iy and -uy as VV forms allowed in general in cmapua but not in other words; they are always monosyllabic. What this
# immediately allows me to do is to give Y a name which is not phonetically irregular! is supported: is too, now.

# capitalization is roughly back to where it was in the original, but all-caps are allowed.

# acronyms are liable to be horrible.

# Fixed the recursion problem in a way which will not be visible in ordinary parses. Streams of cmapua will always
# be broken at name or alien text markers (instead of using lookahead to check that we do not stand at the beginning
# of a name word or alien text word). The next cycle will then check for a name or alien text, and also check for
# badnamemarkers; no lookahead is happening while a stream of cmapua is being read except checking for
# the markers of names and alien text.
# This will change the way phonetic parses look (streams of cmapua will
# break (and sometimes resume) at name markers or alien text markers), but it will not change any grammatical
# parses.

#Part I Phonetics

# Mod bugs, I have implemented all of Loglan phonetics as described in my proposal. Borrowing djifoa are pretty tricky.
# I have now parsed all the words in the dictionary, and all single words of appropriate classes parse successfully.

# I have added alien text and quotation constructions which do not conform to these rules; so actually
# all Loglan text should parse, mod some punctuation and capitalization issues. The conventions for
# alien text here are not the same as those in the current provisional parser.

# I believe the conventions for forcing comma pauses before vowel initial cmapua and after names
# except in special contexts have been enforced. In a full grammar, one probably would want
# to disable pauses before vowel initial letterals (done). This grammar also does not support the lingering
# irregularities in acronyms (and won't).

# This grammar (in Part I) is entirely about phonetics: all it does is parse text into names (with associated initial
# pauses or name markers), cmapua (qua unanalyzed streams of cmapua units),
# borrowings and complexes, along with interspersed comma pauses and marks
# of terminal punctuation. It does support conventions about where commas are required
# and a simple capitalization rule. Streams of cmapua break when markers initial
# in other forms are encountered (and may in some cases resume when the markers
# are a deception).

# a likely locus for odd bugs is the group of predstartX rules which detect apparent cmapua which
# are actually preambles to predicates. These are tricky! (and I did indeed find some lingering
# problems when I parsed the dictionary).
# Another reason to watch this rule predstart
# is that it carries a lot of weight: !predstart is used as a lightweight test
# that what follows is a cmapua (a point discussed in more detail later).

# In reviewing this, I think that very little is different from 1990's Loglan (the borrowing djifoa
# are post-1989 L1, but not my creation). Some things add precision without making anything in 1990's Loglan incorrect.

# The requirement that syllabic consonants be doubled is new, and makes some 1990's Loglan names incorrect.

# The requirement that names resolve into syllables is new, and makes some 1990's Loglan names incorrect,
# usually because they end in three consonants.

# The rule restricting final consonant pairs from being noncontinuant/continuant is new, but
# does not affect any actual predicate ever proposed.

# Enhancing the VccV rule to also forbid CVVV...ccV caused one predicate to be changed
# ( became , and haiukre was a novelty anyway, using a new name for X in X-ray)

# The exact definition of syllables and use of syllable breaks and stress marks is new (the close comma
# was replaced with the hyphen, so Lo,is becomes Lo-is); but this does not make anything in 1990's Loglan
# incorrect, it merely increases precision and makes phonetic transcript possible.

# Forbidding doubled vowels in borrowings was new, was already approved, and caused us to change
# to .

# Formally allowing the CVccVV and CVcccV predicates without y-hyphens took a proposal in 2013 because
# Appendix H was careless in describing their abandonment of the slinkui test, but the dictionary
# makes it evident that this was their intent all along. The slinkui test had already been
# abandoned in the 1990s.

# Formally abandoning qwx was already something that the dictionary workers in the 1990's were working
# on; we completed it.
# Allowing glottal stop in vowel pairs and forbidding it as an allophone of pause is a new phonetic
# feature in the proposal but not reflected in the parser, of course. Alternative pronunciations of
# y and h and allowing h in final position are invisible or do not make any 1990's Loglan incorrect.

# Permitting false name markers in names was already afoot in the 1990's and the basic outlines of our
# approach were already in place. The rule requiring explicit pauses between a name marker not starting
# a name word and the beginning of the next name word is new, but reflects something which was already
# a fact about 1990's Loglan pronunciation: those pauses had to be made in speech
# (and in the 1990's they had no tools to do relevant computer tests)! The requirement
# that names resolve into syllables restricts which literal occurrences of name markers are actually
# false name markers (the tail they induce in the name must itself resolve into syllables).

# Working out the full details of borrowing djifoa was interesting: I'm not sure that I've done anything
# *new* there; explicitly noting the stress shift in borrowing djifoa might be viewed as something
# new but it is a logical consequence of JCB's permission to pause after a borrowing djifoa, which contains
# explicit language about how it is to be stressed, and the
# final definition of a borrowing djifoa as simply a borrowing followed by -y. The shift strikes
# me as a really good idea anyway, because it marks djifoa with a pause after it as phonetically different
# in an additional way other than ending with the very indistinct vowel y.
# My rules as given here do not
# directly enforce the rule that a borrowing djifoa must be preceded by y but I think they indirectly
# enforce it in all or almost all cases: the parser tries to read a borrowing djifoa before reading
# any other kind of djifoa, so it is hard to see how to deploy a short djifoa in such a way that it would
# fall off the head of a borrowing without using y.

# These phonetics do not support certain irregularities in acronyms. We note that
# it is now allowed to insert <, mue> into an acronym, which would be necessary for example
# between a Ceo letteral and a following VCV letteral.

#Sounds

#all vowels
V1 <- [aeiouyAEIOUY]

#regular vowels
V2 <- [aeiouAEIOU]

#consonants
C1 <- [bcdfghjklmnprstvzBCDFGHJKLMNPRSTVZ]

# letters
letter <- (![qwxQWX] [a-zA-Z])

# a capitalization convention which allows what our current one allows and also allows all-caps.
# if case goes down from upper case to lower case, it can only go back up in certain cases. This
# does allow capitalization of initial segments of words. There is a forward reference to the grammar
# in that free capitalization of embedded literals is permitted, and capitalization of vowels
# guarded with z in literals as in DaiNaizA.
lowercase <- (![qwx] [a-z])
uppercase <- (![QWX] [A-Z])
caprule <- [\"(]? &([z] V1 (!uppercase/&TAI0)/lowercase TAI0 (!uppercase/&TAI0)/!(lowercase uppercase).) letter (&([z] V1 (!uppercase/&TAI0)/lowercase TAI0 (!uppercase/&TAI0)/!(lowercase uppercase).) (letter/juncture))* !(letter/juncture)

# syllable markers: the hyphen is always medial so must be followed by a letter.
# the stress marks can be syllable final and word final. A juncture is never followed
# by another juncture.
juncture <- (([-] &letter)/[\'*]) !juncture
stress <- ['*] !juncture

# terminal punctuation
terminal <- ([.:?!;#])

# characters which can occur in words
character <- (letter/juncture)

# to really get all Loglan text, we should add the alien text constructions and the markers of alien text,
# , , , and certain quotations which violate the phonetic rules.
# we adopt the convention that all alien text may be but does not have to be enclosed in quotes.
# it needs to be understood that in quoted alien text, whitespace is understood as <, y,>; in the unquoted
# version this is shown explicitly. This handling of alien text is taken from the final 1990's treatment
# of Linnaeans = foreign names, and extended by us to replace the impossible treatment of strong
# quotation in 1989 Loglan.
# this is a little different from what is allowed in the previous provisional parser, but similar.
# A difference is that all the alien text markers are allowed to be followed by the same sorts of alien text.
# the forms with and are required to have following quotes in written form to avoid
# unintended parses, which otherwise become likely in case of typos in non-alien text cases.
AlienText <- ([,]? [ ]+ [\"] (![\"].)+ [\"]/ [,]? [ ]+ (![, ]!terminal .)+ ([,]? [ ]+ [y] [,]? [ ]+ (![, ]!terminal .)+)*)
AlienWord <- &caprule ([Hh] [Oo] [Ii] juncture? &([,]? [ ]+ [\"])/[Hh][Uu] juncture? [Ee] juncture? &([,]? [ ]+ [\"]) / [Ll] [Ii] juncture? [Ee]juncture?/[Ll] [Aa] juncture? [Oo]/[Ss] [Aa] juncture? [Oo]juncture?/[Ss] [Uu] juncture? [Ee]juncture?) AlienText

# while reading streams of cmapua, the parser will watch for the markers of alien text.
alienmarker <- ([Hh] [Oo] [Ii] juncture? &([,]? [ ]+ [\"])/[Hh][Uu] juncture? [Ee] juncture? &([,]? [ ]+ [\"]) / [Ll] [Ii] juncture? [Ee] juncture? /[Ll] [Aa] juncture? [Oo] juncture? /[Ss] [Aa] juncture? [Oo] juncture?/[Ss] [Uu] juncture? [Ee] juncture?)
!V1

# the continuant consonants and the syllabic pairs they can form
continuant <- [mnlrMNLR]
syllabic <- (([mM] [mM] !(juncture? [mM]))/([nN] [nN] !(juncture? [nN]))/([rR] [rR] !(juncture? [rR]))/([lL] [lL] !(juncture? [lL])))

# the obligatory monosyllables, and these syllables when broken by a usually bad syllable juncture.
# The i-final forms are not obligatory mono when followed by another i.
MustMono <- (([aeoAEO] [iI] ![iI]) /([aA] [oO]))
BrokenMono <- (([aeoAEO] juncture [iI] ![iI])/([aA] juncture [oO]))

# the obligatory and optional monosyllables. Sequences of three of the same letter
# are averted. Avoid formation of doubled i or u after iu or ui.
Mono <- (MustMono/([iI] !([uU] [uU]) V2)/([uU] !([iI] [iI]) V2))

# vowel pairs of the form found in cmapua and djifoa.
# (other than the special IY, UY covered in the cmapua rules)
VV <- (!BrokenMono V2 juncture? V2)

# the next vocalic unit to be chosen from a stream of vowels
# in a predicate or name. This is different than in our Sources
# and formally described in the proposal.
NextVowels <- (MustMono/(V2 &MustMono)/Mono/V2)

# the doubled vowels that trigger the rule that one of them must be stressed
DoubleVowel <- (([aA] juncture? [aA])/([eE] juncture? [eE])/([oO] juncture? [oO])/([iI] juncture [iI])/([uU] juncture [uU])/[iI] [Ii] &[iI]/[Uu] [uU] &[uU])

# the mandatory "vowel" component of a syllable
Vocalic <- (NextVowels/syllabic/[y])

# the permissible initial pairs of consonants, and the same pairs possibly
# broken by syllable junctures.
Initial <- (([Bb] [Ll])/([Bb] [Rr])/([Cc] [Kk])/([Cc] [Ll])/([Cc] [Mm])/([Cc] [Nn])/([Cc] [Pp])/([Cc] [Rr])/([Cc] [Tt])/([Dd] [Jj])/([Dd] [Rr])/([Dd] [Zz])/([Ff] [Ll])/([Ff] [Rr])/([Gg] [Ll])/([Gg] [Rr])/([Jj] [Mm])/([Kk] [Ll])/([Kk] [Rr])/([Mm] [Rr])/([Pp] [Ll])/([Pp] [Rr])/([Ss] [Kk])/([Ss] [Ll])/([Ss] [Mm]) /[Ss] [Nn]/([Ss] [Pp])/([Ss] [Rr])/([Ss] [Tt])/([Ss] [Vv])/([Tt] [Cc])/([Tt] [Rr])/([Tt] [Ss])/([Vv] [Ll])/([Vv] [Rr])/([Zz] [Bb])/([Zz] [Ll])/([Zz] [Vv]))
MaybeInitial <- (([Bb] juncture? [Ll])/([Bb]juncture? [Rr])/([Cc]juncture? [Kk])/([Cc] juncture? [Ll])/([Cc]juncture? [Mm])/([Cc]juncture? [Nn])/([Cc]juncture? [Pp])/([Cc]juncture? [Rr])/([Cc]juncture? [Tt])/([Dd]juncture? [Jj])/([Dd]juncture? [Rr])/([Dd]juncture? [Zz])/([Ff]juncture? [Ll])/([Ff]juncture? [Rr])/([Gg]juncture? [Ll])/([Gg]juncture? [Rr])/([Jj]juncture? [Mm])/([Kk]juncture? [Ll])/([Kk] juncture? [Rr])/([Mm]juncture? [Rr])/([Pp]juncture? [Ll])/([Pp]juncture? [Rr])/([Ss]juncture? [Kk])/([Ss]juncture? [Ll])/([Ss] juncture? [Mm]) /[Ss] juncture? [Nn]/([Ss]juncture? [Pp])/([Ss]juncture? [Rr])/([Ss]juncture? [Tt])/([Ss]juncture? [Vv])/([Tt]juncture? [Cc])/([Tt]juncture? [Rr])/([Tt] juncture? [Ss])/([Vv]juncture? [Ll])/([Vv]juncture? [Rr])/([Zz]juncture? [Bb])/([Zz] juncture? [Ll])/([Zz] juncture? [Vv]))

# the permissible initial consonant groups in a syllable. Adjacent consonants should be initial pairs.
# The group should not overlap a syllabic pair. Such a group is of course followed by a vocalic unit.
# this rule for initial consonant groups is stated in NB3.
# I forbid a three-consonant initial group to be followed by a syllabic pair. This seems obvious.
InitialConsonants <- ((!syllabic C1 &Vocalic)/(!(C1 syllabic) Initial &Vocalic)/(&Initial C1 !(C1 syllabic) Initial !syllabic &Vocalic))

# the forbidden medial pairs and triples. These are forbidden regardless of placement
# of syllable breaks.
# each of these is actually a single consonant followed by an initial, and the idea was to identify CVC-CCV junctions which
# would be hard to pronounce. But the placement of the syllable break is not relevant to the exclusion of the sequence.
# Notice that the continuant syllabic pairs are excluded: this prevents final consonants from being included in such pairs.
NoMedial2 <- (([Bb] juncture? [Bb])/([Cc] juncture? [Cc])/([Dd] juncture? [Dd])/([Ff] juncture? [Ff])/([Gg] juncture? [Gg])/([Hh] juncture? C1)/([Jj] juncture? [Jj])/([Kk] juncture? [Kk])/([Ll] juncture? [Ll])/([Mm] juncture? [Mm])/([Nn] juncture? [Nn])/([Pp] juncture? [Pp])/([Rr] juncture? [Rr])/([Ss] juncture? [Ss])/([Tt] juncture? [Tt])/([Vv] juncture? [Vv])/([Zz] juncture? [Zz])/([CJSZcjsz] juncture? [CJSZcjsz])/([Ff] juncture? [Vv])/([Kk] juncture? [Gg])/([Pp] juncture? [Bb])/([Tt] juncture? [Dd])/([FKPTfkpt] juncture? [JZjz])/([Bb] juncture? [Jj])/([Ss] juncture? [Bb]))
NoMedial3 <- (([Cc] juncture? [Dd] juncture? [Zz])/([Cc] juncture? [Vv] juncture? [Ll])/([Nn] juncture? [Dd] juncture? [Jj])/([Nn] juncture? [Dd] juncture? [Zz])/([Dd] juncture? [Cc] juncture? [Mm])/([Dd] juncture? [Cc] juncture? [Tt])/([Dd] juncture? [Tt] juncture? [Ss])/([Pp] juncture? [Dd] juncture? [Zz])/([Gg] juncture? [Tt] juncture? [Ss])/([Gg] juncture? [Zz] juncture? [Bb])/([Ss] juncture? [Vv] juncture? [Ll])/([Jj] juncture? [Dd] juncture? [Jj])/([Jj] juncture? [Tt] juncture? [Cc])/([Jj] juncture? [Tt] juncture? [Ss])/([Jj] juncture? [Vv] juncture? [Rr])/([Tt] juncture? [Vv] juncture? [Ll])/([Kk] juncture? [Dd] juncture? [Zz])/([Vv] juncture? [Tt] juncture? [Ss])/([Mm] juncture? [Zz] juncture? [Bb]))

# The syllable.
# there are no formal rules about syllables as such in our Sources, which is odd since
# the definition of predicates depends on the placement of stresses on syllables.
# The first rule enforces the special point needed in complexes that
# a CVC syllable is preferred to a CV syllable where possible; we economically apply
# the same rule for default placement of syllable breaks everywhere, which is, with
# that exception, that the break comes as soon as possible.
# the SyllableB approach is taken if the following syllable would otherwise start with a syllabic pair.
# the reason for this approach is that if one syllabizes a well formed complex in this way...
# the syllable breaks magically fall on the djifoa boundaries. This does mean that the
# default break in is , which feels funny but is harmless. Explicitly breaking
# it will also parse correctly.
SyllableA <- (C1 V2 FinalConsonant (!Syllable FinalConsonant)?) !syllabic
SyllableB <- (InitialConsonants? Vocalic (!Syllable FinalConsonant)? (!Syllable FinalConsonant)?)
Syllable <- ((SyllableA/SyllableB) juncture?)

# The final consonant in a syllable. There may be one or two final consonants. A pair of final
# consonants may not be a non-continuant followed by a continuant. A final consonant may not
# start a forbidden medial pair or triple.
# The rule that a final consonant pair may not be a non-continuant followed by a continuant
# is natural and obvious but not in our Sources. Such a pair of consonants would seem to
# naturally form another syllable.
FinalConsonant <- !syllabic (!(!continuant C1 !Syllable continuant) !NoMedial2 !NoMedial3 C1 !(juncture? V2))

# Here are various flavors of syllable we may need.

# this is a portmanteau definition of a bad syllable (the sort not allowed in a borrowing).
SyllableD <- &(InitialConsonants?
([y]/DoubleVowel/BrokenMono/&Mono V2 DoubleVowel/!MustMono &Mono V2 BrokenMono)) Syllable

# this (below) is the kind of syllable which can exist in a borrowed predicate:
# it cannot start with a continuant pair, it cannot have a y as vocalic unit,
# and its vocalic unit (whether it has one or two regular vowels)
# cannot be involved in a double vowel or an explicitly broken
# mandatory monosyllable.
BorrowingSyllable <- !syllabic (!SyllableD) Syllable

# this is the final syllable of a predicate. It cannot be followed
# without pause by a regular vowel.
VowelFinal <- InitialConsonants? Vocalic juncture? !V2

# syllables with syllabic consonant vocalic units
# this class is only used in borrowings, and we *could* reasonably
# require it to be followed by a vowel. But I won't for now.
# for gluing this restriction would work, but we might literally borrow predicates
# with syllabic continuant pronunciations.
SyllableC <- (&(InitialConsonants? syllabic) Syllable)

# syllables with y
SyllableY <- (&(InitialConsonants? [y]) Syllable)

# an explicitly stressed syllable.
StressedSyllable <- ((SyllableA/SyllableB) [\'*])

# a final syllable in a word, ending in a consonant.
NameEndSyllable <- (InitialConsonants? (syllabic/Vocalic &FinalConsonant) FinalConsonant? FinalConsonant? stress? !letter)

# the pause classes actually hang on the letter before the pause.

# whitespace which might or might not be a pause.
maybepause <- (V1 [\'*]? [ ]+ C1)

# explicit pauses: these are whitespace before a vowel or after a consonant, or comma marked pauses.
pause <- ((C1 [\'*]? [ ]+ &letter)/(letter [\'*]? [ ]+ &V1)/(letter [\'*]? [,] [ ]+ &letter))

# these are final syllables in words followed by whitespace which might not be a pause.
# the definition actually doesn't mention the maybepause class.
MaybePauseSyllable <- InitialConsonants? Vocalic ['*]? &([ ]+ &C1)

# The full analysis of names.

# a name word (without initial marking) is resolvable into syllables and ends with a consonant.
PreName <- ((Syllable &Syllable)* NameEndSyllable)

# this is a busted name word with whitespace in it -- but not whitespace at which one has to pause.
BadPreName <- (MaybePauseSyllable [ ]+/Syllable &Syllable)* NameEndSyllable

# This is a name marker followed by a consonant initial name word without pause.
# I deployed a minimal set of name marker words; I can add the others whenever.
# I have decided (see below) to retain the social lubrication words as vocative markers
# *without* making them name markers, so one must pause . By not allowing
# freemods right after vocative markers in the vocative rule, I make work as well,
# without pause.
MarkedName <- &caprule ((([Ll] !pause [Aa] juncture?)/ ([Hh] [Oo] !pause [Ii] juncture?) / ([Hh] [Uu] juncture? !pause [Ee] juncture?) / ([Cc] !pause [Ii] juncture?)/([Ll] [Ii] juncture? !pause [Uu] juncture?)/[Gg][Aa] !pause [Oo] juncture?/[Mm][Uu] juncture? !pause [Ee] juncture?) [ ]* &C1 &caprule PreName)

# This is an unmarked name word with a false name marker in it.
FalseMarked <- (&PreName (!MarkedName character)* MarkedName)

# This is the full definition of name words. These are either marked consonant initial names without pause defined above,
# names without false name markers beginning with explicit pauses (either comma marked or vowel-initial)
# and name markers followed, with or without pause, by name words. In the latter case there must be at least
# whitespace before a vowel initial name.
# a series of names without false name markers and names marked with ci, separated by spaces, may be appended.
# there is a look ahead at the grammar: a NameWord can be followed without explicit pause (there is whitespace and
# a pause in speech!) by another
# kind of utterance only in a serial name when what follows is of the form predunit, to be included
# in the name.
NameWord <- (&caprule MarkedName/([,] [ ]+ !FalseMarked &caprule PreName)/(&V1 !FalseMarked &caprule PreName)/&caprule ((([Ll] [Aa] juncture?)/([Hh] [Oo] [Ii] juncture?)/([Cc] [Ii] juncture?)/([Ll] [Ii] juncture? [Uu] juncture?)/[Mm] [Uu] juncture? [Ee] juncture?/[Gg] [Aa] [Oo] juncture?) !V1 [,]? [ ]* &caprule PreName))([,]?[ ]+ !FalseMarked &caprule PreName/[,]?[ ]+ &([Cc] [Ii]) NameWord)* &([ ]* [Cc] [Ii] predunit/&([,] [ ]+/terminal/[\")]/!.)./!.)

# this is the minimal set of name marker words we are using. We may add more.
# I am contemplating adding the words of social lubrication as name markers, but in a more restricted
# way than in the last provisional parser, in which I made them full-fledged vocative markers. [Actually,
# I preserved their status as vocative markers without restoring their status as name markers, in the latest version].
# adding as a name marker
namemarker <- ([Ll] [Aa] juncture?/[Hh][Oo][Ii] juncture?/([Hh] [Uu] juncture? [Ee] juncture?)/[Cc][Ii] juncture?/[Ll][Ii] juncture? [Uu] juncture?/[Gg][Aa][Oo] juncture?/[Mm] [Uu] juncture? [Ee] juncture?) !V1

# this is the bad name marker phenomenon that needs to be excluded. This captures the idea
# that what follows the name could be pronounced without pause as a name word according to the
# orthography, but the fact that whitespace is present shows that this is not the intention.
# it is worth noting that name markers at heads of name words pass this test
# (because I omitted the test that what follows is not a PreName in the interests
# of minimizing lookahead);
# but this test is only applied to strings that have already been determined not to
# be of class NameWord.
badnamemarker <- namemarker !V1 [, ]? [ ]* BadPreName

# we test for the bad name marker condition at the beginning of each stream of cmapua,
# and streams of cmapua stop before name markers (and may resume at a name marker
# if neither a NameWord nor the bad marker condition is found).
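
# A minimal Python sketch of how the namemarker rule above behaves under PEG semantics:
# the ordered choice commits to the first marker that matches, each juncture? is greedy
# and never retried, and only then is the !V1 lookahead applied. This is an illustration,
# not the Python PEG engine mentioned in the dated updates. The helper names (juncture,
# namemarker, MARKERS, LETTERS) are hypothetical, and only the letter, V1, juncture and
# namemarker rules are mirrored.

```python
import string

V1 = set("aeiouyAEIOUY")
# letter <- (![qwxQWX] [a-zA-Z])
LETTERS = set(string.ascii_letters) - set("qwxQWX")

# Each marker is a list of chunks (hypothetical encoding); an optional juncture
# may follow every chunk, matching the places where the rule writes `juncture?`.
MARKERS = [["la"], ["hoi"], ["hu", "e"], ["ci"], ["li", "u"], ["gao"], ["mu", "e"]]


def juncture(s, i):
    """Index just past a juncture starting at s[i], or None.

    Mirrors: juncture <- (([-] &letter)/['*]) !juncture
    """
    if i >= len(s):
        return None
    hyphen = s[i] == "-" and i + 1 < len(s) and s[i + 1] in LETTERS
    if (hyphen or s[i] in "'*") and juncture(s, i + 1) is None:
        return i + 1
    return None


def namemarker(s, i=0):
    """Index just past a name marker starting at s[i], or None."""
    for chunks in MARKERS:
        j = i
        for chunk in chunks:
            if s[j:j + len(chunk)].lower() != chunk:
                break  # this alternative fails; try the next marker
            j += len(chunk)
            k = juncture(s, j)  # juncture? consumes a juncture when one is present
            if k is not None:
                j = k
        else:
            # The choice has committed to this marker; now apply !V1.
            return j if j >= len(s) or s[j] not in V1 else None
    return None
```

# For instance, namemarker("la Djan") returns 2, while namemarker("la-a") returns None:
# the hyphen is consumed as a juncture and the following vowel then fails the !V1 test.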
# We have at any rate completely solved the phonetic problem of names and their markers.

# predicate start tests: the idea is the same as class "connective" below, to recognize
# the start of a predicate without recursive appeals to the whole nasty definition of predicate.
# The reason to do it is to recognize when CV^n followed by CC cannot be a cmapua unit.

# we note, though we have taken no action that
# (1) we don't see any reason for there to be any predicates starting CVVVV..CC; there is at least one starting CVVV
# (2) actually, any CVVVV...CC segment can *only* start a predicate, if one takes all rules into account. But
# since this is incredible we haven't worried about it.
# If we forbade CVVVV... to start a predicate this would sharply bound the lookahead here.
# However, CVVVV... never occurs in legal Loglan text.

# not all of the things identified are actually starts of predicates: some are simply illegal.

# these tests are designed to ensure that an initial CV^n in a predicate is recognized
# by the rule scanning for cmapua.

# if we encounter a CC which is not initial the (C)V^n before it is part of the predicate.

# working on a shorter set of predstart tests. Where one of these tests
# is passed, either a predicate starts here or the string is illegal.

# A. CV^nC^n where the C^n cannot start a predicate. This includes the condition that
# the first vocalic unit in a predicate cannot be syllabic.
predstartA <- C1? (V2 juncture?)+ !(InitialConsonants !syllabic Vocalic) C1 juncture? (C1 juncture?)+

# formulated a universal case B to handle all the djifoa with y's in them
predstartB <- C1 (V2 juncture?) ![y] (V2 juncture?)* (C1 juncture?)* [y]

# C. CV^nCC with initial syllable stressed
predstartC <- &StressedSyllable C1? (Mono/V2) juncture? C1 juncture? C1

# cases D and E implement the point that CCVV predicates are forbidden.

# D. CV^nC^nV^1 or 2 !char
predstartD <- C1? (V2 juncture?)+ &Initial InitialConsonants V2 juncture? (V2 juncture?)?
!character # E. CV^nV2(stressed) predstartE <- C1? (V2 juncture?)+ &Initial InitialConsonants V2 stress !Mono V2 # F. special situation where Cvv-V or V can fall off the front. A V or Cvv-V cmapua unit or a stream of VV units # (but not a CVV unit) may be followed without pause by ccV. # I changed this rule so as not to entirely arbitrarily halve the improbable and so far unused domain of possible # borrowings starting with a consonant followed by many many vowels. Probably these should be banned, but this is not # the reason to do it :-) predstartF <- ((C1 Mono juncture? V2)/(VV juncture?)+ /V2) [-]? MaybeInitial V2 # and here at last is predstart. This function does lookahead to a consonant pair and decides # whether what it passes over is an initial segment of a predicate. This is actually used # as a test on the predicate classes, and it is also used to recognize where a cmapua must end. predstart <- !predstartF (predstartA/predstartB/predstartC/predstartD/predstartE/MaybeInitial) # it is worth noting that in the sequel we have systematically replaced tests &Cmapua # with !predstart. The former involves lots of lookahead and was causing recursion crashes # in Python. The phonetics and the grammar are both structured so that any string # starting with a name marker is tested for NameWord-hood before it is tested for # cmapua-hood; the only thing it is tested for later is predicate-hood, and predstart # is a rough and ready test that something might be a predicate (and at any rate # cannot be a cmapua). # this class requires pauses before it, after all the phonetic word classes. # what is being recognized is the beginning of a logical connective. # To avoid horrible recursion problems, giving this a concrete phonetic definition # without much lookahead. This can go right up in the phonetics section if it works # (and here it is!). connective <- [ ]* !predstart ([Nn] [Uu] juncture? &([Uu]/[Nn]))? !predstart ([Nn] [Oo] juncture?)? V2 juncture? 
!V2 !([Ff] [Ii]) !([Mm] [Aa]) !([Zz] [Ii])

# cmapua units starting with consonants. This is the exact description from NB3. The fancy tail in each of the
# three cases is enforcing the rule about pausing before a following predicate if stressed.
# consonant initial cmapua units may not be followed by vowels without pause.
# I am adding IY and UY (always monosyllables, pronounced yuh and wuh) as vowel pairs permitted in VV and CVV cmapua units.
# it is worth noting that the "yuh" and "wuh" pronunciations of these diphthongs
# are surprising to the English-reading eye.
# The envisaged use for this is that the name of Y becomes easy to introduce. Adding word space
# is always nice, and these words seem pronounceable. I also made possible: Y now has phonetically
# regular names.

CmapuaUnit <- (C1 Mono juncture? V2 !(['*] [ ]* &C1 predstart) juncture? !V1/C1 (VV/[Ii][Yy]/[Uu][Yy]) !(['*] [ ]* &C1 predstart) juncture? !V1/C1 V2 !(['*] [ ]* &C1 predstart) juncture? !V1)

# A stream of cmapua is read until the start of a predicate or a name marker word or an alien text marker word or a quote or parenthesis marker word is encountered.
# the stream might resume with a name marker word if it does not in fact start a name word and does not potentially start a name
# word due to inexplicit whitespace (doesn't satisfy the bad name marker condition).
# we force explicit comma pauses before logical connectives, but not before vowel initial cmapua in general;
# other conditions force at least whitespace, which does stand for a pause, before such words.
# detect starts of quotes or parentheses with
# LI or KIE

likie <- ([Ll] [Ii] juncture? !V1/[Kk] [Ii] juncture? [Ee] juncture? !V1)

# a special provision is made for NO UI forms as single words. is supported.

Cmapua <- &caprule !badnamemarker (!predstart [Nn] [Oo] juncture? !predstart (VV/[Ii][Yy]/[Uu][Yy]) !(['*] [ ]* predstart) juncture?/((!predstart (VV/[Ii][Yy]/[Uu][Yy]) !(['*] [ ]* &C1 predstart) juncture?)+ / ((!predstart V1 !(['*] [ ]* &C1 predstart) juncture?)/ !predstart CmapuaUnit) (!namemarker !alienmarker !likie !predstart CmapuaUnit)*)/!predstart V2 !(['*] [ ]* &C1 predstart) juncture?) !V1 !([ ]* (connective))

# I have apparently now completely solved the problem of parsing cmapua as well as name words.
# Now for predicates.
# the elementary djifoa (not borrowings)
# various special flavors of these djifoa will be needed.
# These are the general definitions.
# The NOY and Bad forms are for use for testing candidate borrowings for resolution
# with bad syllable break placements. Borrowings do not contain Y...
# CVV djifoa with phonetic hyphens.
# added checks to all cmapua classes: the vowel final ones, when not phonetically hyphenated, cannot
# be followed by a regular vowel. This is crucial for getting the syllable analysis and the djifoa
# analysis to end at the same point.

CVV <- C1 VV (juncture? [y] [-]? &Complex /[r] juncture? &C1/[n] juncture? &[r]/juncture? !V2)
CVVNoHyphen <- C1 VV juncture? !V2
CVVHiddenStress <- C1 &DoubleVowel V1 [-]? V1 ([-]? [y] [-]? &Complex /[r] [-]? &C1/[n] [-]? &[r]/[-]? !V2)
CVVFinalStress <- C1 VV (['*] [y] [-]? &Complex /[r] ['*] &C1/[n] ['*] &[r]/['*] !V2)
CVVNOY <- C1 VV ([r] juncture? &C1/[n] juncture? &[r]/juncture? !V2)
CVVNOYFinalStress <- C1 VV ([r] ['*] &C1/[n] ['*] &[r]/['*] !V2)
CVVNOYMedialStress <- C1 !BrokenMono V2 ['*] V2 [-]? !V2

# CCV djifoa with phonetic hyphens.

CCV <- Initial V2 (juncture? [y] [-]? &letter/juncture? !V2)
CCVStressed <- Initial V2 (['*] [y] [-]? &letter/['*] !V2)
CCVNOY <- Initial V2 juncture?
!V2 CCVBad <- MaybeInitial V2 juncture? !V2 CCVBadStressed <- MaybeInitial V2 ['*] !V2 # CVC djifoa with phonetic hyphens. These cannot be final and are always followed by a consonant (well, the # -y form may be followed by a vowel... # an eccentric syllable break is supported if the CVC is y-hyphenated: # and are both legal. The default is the latter. CVC <- (C1 V2 !NoMedial2 !NoMedial3 C1 (juncture? [y] [-]? &letter/juncture? &C1)/C1 V2 juncture C1 [y] [-]? &letter) CVCStressed <- (C1 V2 !NoMedial2 !NoMedial3 C1 (['*] [y] [-]? &letter/['*] &letter)/C1 V2 ['*] C1 [y] [-]? &letter) CVCNOY <- C1 V2 !NoMedial2 !NoMedial3 C1 juncture? &C1 CVCBad <- C1 V2 !NoMedial2 !NoMedial3 juncture? C1 &C1 CVCNOYStressed <- C1 V2 !NoMedial2 !NoMedial3 C1 ['*] &C1 CVCBadStressed <- C1 V2 !NoMedial2 !NoMedial3 ['*] C1 &C1 # the five letter forms (always final in complexes) CCVCV <- Initial V2 juncture? C1 V2 [-]? !V2 CCVCVStressed <- Initial V2 ['*] C1 V2 [-]? !V2 CCVCVBad <- MaybeInitial V2 juncture? C1 V2 [-]? !V2 CCVCVBadStressed <- MaybeInitial V2 ['*] C1 V2 [-]? !V2 CVCCV <- (C1 V2 juncture? Initial V2 [-]? !V2/C1 V2 !NoMedial2 C1 juncture? C1 V2 [-]? !V2) CVCCVStressed <- (C1 V2 ['*] Initial V2 [-]? !V2/C1 V2 !NoMedial2 C1 ['*] C1 V2 [-]? !V2) # the medial five letter djifoa CCVCY <- Initial V2 juncture? C1 [y] [-]? CVCCY <- (C1 V2 juncture? Initial [y] [-]?/C1 V2 !NoMedial2 C1 juncture? C1 [y] [-]?) CCVCYStressed <- Initial V2 ['*] C1 [y] [-]? CVCCYStressed <- (C1 V2 ['*] Initial [y] [-]?/C1 V2 !NoMedial2 C1 ['*] C1 [y] [-]?) # to reason about resolution of borrowings into both syllables and djifoa (we want to exclude the latter # but we need to define it adequately) we need to recognize where to stop. A predicate word ends either # at a non-character (not a letter or syllable mark: whitespace, comma or terminal punctuation) or it # has an explicit or deducible penultimate stress. 
Borrowings do not contain doubled vowels, so they # have to have explicit stress in the latter case. # analysis: the stressed tail consists of a stressed syllable followed by an unstressed syllable. # identifying an unstressed final syllable is complicated by recognizing which CVV combinations can # be one syllable. This will either be an explicitly stressed syllable followed by a single syllable # or a syllable suitable to be stressed followed by an explicitly final syllable. CVV djifoa can # contain both syllables in a tail and of course the five letter djifoa have to be tails. A never stressed # SyllableC (with a continuant) may intervene. # tail of a borrowing with an explicit stress BorrowingTail1 <- !SyllableC &StressedSyllable BorrowingSyllable (!StressedSyllable &SyllableC BorrowingSyllable)? !StressedSyllable &BorrowingSyllable VowelFinal # tail of a borrowing or borrowing djifoa with no explicit stress BorrowingTail2 <- !SyllableC BorrowingSyllable (!StressedSyllable &SyllableC BorrowingSyllable)? !StressedSyllable &BorrowingSyllable VowelFinal (&[y]/!character) # tail of a stressed borrowing djifoa, different because stress is shifted to the end BorrowingTail3 <- !SyllableC !StressedSyllable BorrowingSyllable (!StressedSyllable &SyllableC BorrowingSyllable)? &BorrowingSyllable InitialConsonants? Vocalic ['*] &[y] BorrowingTail <- BorrowingTail1 / BorrowingTail2 # short forms that are ruled out: CCVV and CCCVV forms. CCVV <- (InitialConsonants V2 juncture? V2 juncture? !character / InitialConsonants V2 ['*] !Mono V2 juncture?) # VCCV and some related forms are ruled out (rule predstartF above is about this) # a continuant syllable cannot be initial in a borrowing and there cannot be successive continuant # syllables. There really ought to be no more than one! 
# borrowing, before checking that it doesn't resolve into djifoa

PreBorrowing <- &predstart!CCVV!Cmapua!SyllableC(!BorrowingTail!(StressedSyllable)!(SyllableC SyllableC)BorrowingSyllable)* BorrowingTail

# ditto for an explicitly stressed borrowing

StressedPreBorrowing <- &predstart!CCVV!Cmapua!SyllableC(!BorrowingTail!(StressedSyllable)!(SyllableC SyllableC)BorrowingSyllable)* BorrowingTail1

# borrowing djifoa without explicit stress (before resolution check)

PreBorrowing2 <- &predstart!CCVV!Cmapua!SyllableC(!BorrowingTail!(StressedSyllable)!(SyllableC SyllableC)BorrowingSyllable)* BorrowingTail2

# stressed borrowing djifoa (before resolution check).

PreBorrowing3 <- &predstart!CCVV!Cmapua!SyllableC(!BorrowingTail3!(StressedSyllable)!(SyllableC SyllableC)BorrowingSyllable)* BorrowingTail3

# Now comes the problem of trying to say that a preborrowing cannot resolve into cmapua. The difficulty is with
# recognizing the tail, so making sure that the two resolutions stop in the same place.
# we know because it is a borrowing that there is at most one explicit stress, and it has to fall
# in one of the cmapua! This should make it doable.
# borrowing djifoa are terminated with y, so the final djifoa needs to take this into account
# the idea behind both djifoa analyses is the same. If we end with a final djifoa followed by
# a non-character, we improve our chances of ending the syllable analysis at the same point. We control
# this by identifying djifoa with stresses in them: a medially stressed djifoa must be the last one
# (and the syllable analysis will find its stressed syllable and end at its final syllable, the fact
# that djifoa cannot be followed by vowels ensuring that the syllable analysis cannot overrun its end).
# When the djifoa is finally stressed, the complex analysis ends with a further djifoa guaranteed to have
# just one syllable, and the syllable analysis again will stop in the same place.
The medial five letter forms # and borrowing djifoa of course are finally stressed mod an additional unstressed syllable which is skipped # by the syllable analysis, because it allows one to ignore an actually penultimate syllable with y or # a syllabic consonant. In the case where we never find a stress and end up at a final djifoa, the syllable # analysis will carry right through to the same final point. # in the attempted resolution of borrowings, our life is easier because we do not have # borrowing djifoa or medial five letter forms to consider, or any forms with y-hyphens. RFinalDjifoa <- (CCVCVBad/CVCCV/CVVNoHyphen/CCVBad/CVCBad) (&[y]/!character) RMediallyStressed <- (CCVCVBadStressed/CVCCVStressed/CVVNOYMedialStress) RFinallyStressed <- (CVVNOYFinalStress/CCVBadStressed/CVCBadStressed/CVCNOYStressed) BorrowingComplexTail <- (RMediallyStressed/RFinallyStressed (&(C1 Mono) CVVNoHyphen/CCVBad)/RFinalDjifoa) ResolvedBorrowing <- (!BorrowingComplexTail(CVVNOY/CCVBad/CVCBad))* BorrowingComplexTail # borrowed predicates Borrowing <- !ResolvedBorrowing &caprule PreBorrowing !([ ]* (connective)) # explicitly stressed borrowed predicates StressedBorrowing <- !ResolvedBorrowing &caprule StressedPreBorrowing !([ ]* &V1 Cmapua) #This is the shape of non-final borrowing djifoa. Notice that a final stress is allowed. #The curious provision for explicitly stressing a borrowing djifoa and pausing is supported. # borrowing djifoa without explicit stress (stressed ones are not of this class!) # Note that one can pause after these (explicitly, with a comma) BorrowingDjifoa <- !ResolvedBorrowing &caprule PreBorrowing2 (['*] y [,] [ ]+/juncture? [y] [-]?) # stressed borrowing djifoa finally implemented! StressedBorrowingDjifoa <- !ResolvedBorrowing &caprule PreBorrowing3 [y] [-]? ([,] [ ]+)? # We resolve complexes twice, once into syllables and once into djifoa. We again have to ensure that # we end up in the same place! 
The syllable resolution is very similar to that of borrowings; # the unstressed middle syllable of the tail can be a SyllableY, and can also be a # SyllableC if the final djifoa is a borrowing. # A stressed borrowing djifoa with the property that the tail is still a phonetic complex is # a unit for this analysis. # note here that I specifically rule out a complex being followed without pause by y. I do not rule # this out for the vowel final djifoa because they can be followed by y at the end of a borrowing # djifoa. PhoneticComplexTail1 <- !SyllableC !SyllableY &StressedSyllable Syllable (!StressedSyllable &(SyllableC/SyllableY) Syllable)? !StressedSyllable !SyllableY VowelFinal !V1 PhoneticComplexTail2 <- !SyllableC !SyllableY Syllable (!StressedSyllable &(SyllableC/SyllableY) Syllable)? !StressedSyllable !SyllableY VowelFinal !character PhoneticComplexTail <- PhoneticComplexTail1 / PhoneticComplexTail2 # note the explicit predstart test here. PhoneticComplex <- &predstart!CCVV!Cmapua!SyllableC(StressedBorrowingDjifoa &PhoneticComplex/!PhoneticComplexTail!(StressedSyllable)!(SyllableC SyllableC)Syllable)* PhoneticComplexTail # the analysis of final djifoa and stressed djifoa differs only in details from # what is above for resolution of borrowings. The issues about CVV djifoa with doubled # vowels are rather exciting. # a stressed borrowing djifoa with the tail still a phonetic complex is a black box unit for # this construction. # My approach imposes the restriction on JCB's "pause after a borrowing djifoa" idea that what follows # the pause must itself contain a penultimate stress: is a predicate but is not. # while is a predicate. 
# the analysis of the djifoa resolution process is the same as above, with additional remarks # about doubled vowel syllables: notice that where the complex tail involved a doubled vowel syllable # without explicit stress, we insist on that djifoa or the single-syllable next djifoa ending in # a non-character: in the absence of explicit stress, we always rely on whitespace or punctuation # to indicate the end of the predicate. # all sorts of subtleties about borrowings and borrowing djifoa are finessed by always looking for # them first. There are no restrictions re fronts of borrowings or borrowing djifoa looking like regular # djifoa; the fact that borrowing djifoa end in y and borrowings do not contain y makes it always # possible to tell when one is looking at the head of a borrowing djifoa. Regular djifoa just before a borrowing # djifoa need to be y-hyphenated so as not to be absorbed into the front of the borrowing (I don't believe # that I actually need to impose a formal rule to this effect, though I am not absolutely certain; it would # be difficult to formulate [and does appear in the previous version, where it is a truly unintelligible piece # of PEG code]). FinalDjifoa <- (Borrowing/CCVCV/CVCCV/CVVNoHyphen/CCVNOY) !character MediallyStressed <- (StressedBorrowing/CCVCVStressed/CVCCVStressed/CVVNOYMedialStress) FinallyStressed <-(StressedBorrowingDjifoa/CCVCYStressed/CVCCYStressed/CVVFinalStress/CCVStressed/CVCStressed) ComplexTail <- (CVVHiddenStress (&(C1 Mono) CVVNoHyphen/CCVNOY) !character/FinallyStressed (&(C1 Mono) CVVNoHyphen/CCVNOY)/MediallyStressed/FinalDjifoa) PreComplex <- (!CVVHiddenStress (!ComplexTail)(StressedBorrowingDjifoa &PhoneticComplex/BorrowingDjifoa/CVCCY/CCVCY/CVV/CCV/CVC))* ComplexTail # originally I had complicated tests here for the conditions under which an initial # CVC cmapua has to be y-hyphenated: I was being wrong headed, the predstart rules # already enforce this (in the bad cases, the initial CV- falls off). 
The user will # simply find that they cannot put the word together otherwise. The previous version # did need this test because it actually used full lookahead to check for the start of a predicate. Complex <- &caprule &PreComplex PhoneticComplex !([ ]* (connective)) # format for the LI quote and KIE parenthesis LiQuote <- (&caprule [Ll][Ii]juncture? comma2? [\"] phoneticutterance [\"] comma2? &caprule [Ll][Uu]juncture? !([ ]* connective)/(&caprule [Kk][Ii]juncture?[Ee]juncture? comma2? [(] phoneticutterance [)] comma2? &caprule [Kk][Ii]juncture?[Uu]juncture? !([ ]* connective))) # the condition on Word that a Cmapua is not followed by another Cmapua # with mere whitespace between was used by quotation, but is now redundant, # because I have required that quotations be closed with explicit pauses in all cases. Word <- (NameWord / Cmapua !([ ]*Cmapua)/ Complex) # it is an odd point that all borrowings parse as complexes -- so when I parsed all the words the first time they all # parsed as complexes. A borrowing is a complex consisting of a single final borrowing djifoa! # I did redesign this so that borrowings are parsed as borrowings. (This is the class # I used to parse the dictionary). # Yes, CVC djifoa do get parsed as names in the dictionary, so the CVC case here is redundant. I actually # think that only the CCV djifoa actually get parsed as such. SingleWord <- (Borrowing !./Complex !./ Word !./PreName !. /CVV/CCV/CVC) !. # name word appearing initially without leading spaces is important, because one type of NameWord includes a leading comma. 
phoneticutterance1 <- (NameWord /[ ]* LiQuote/[ ]* NameWord/[ ]* AlienWord/[ ]*Cmapua/[ ]* '--'/[ ]* '...'/[ ]* Borrowing![y]/[ ]* Complex)+ phoneticutterance <- (phoneticutterance1/[,][ ]+/terminal)+ # consonants and vowel groups in cmapua # as noted above, !predstart stands in for the computationally disastrous &Cmapua badstress <- ['*] [ ]* &C1 predstart B <- (!predstart [Bb]) C <- (!predstart [Cc]) D <- (!predstart [Dd]) F <- (!predstart [Ff]) G <- (!predstart [Gg]) H <- (!predstart [Hh]) J <- (!predstart [Jj]) K <- (!predstart [Kk]) L <- (!predstart [Ll]) M <- (!predstart [Mm]) N <- (!predstart [Nn]) P <- (!predstart [Pp]) R <- (!predstart [Rr]) S <- (!predstart [Ss]) T <- (!predstart [Tt]) V <- (!predstart [Vv]) Z <- (!predstart [Zz]) # the monosyllabic classes may be followed by one vowel # if they start a Cvv-V cmapua unit; the others may never # be followed by vowels. Classes ending in -b are # used in Cvv-V cmapua units. a <- ([Aa] !badstress juncture? !V1) e <- ([Ee] !badstress juncture? !V1) i <- ([Ii] !badstress juncture? !V1) o <- ([Oo] !badstress juncture? !V1) u <- ([Uu] !badstress juncture? !V1) V3 <- juncture? V2 !badstress AA <- ([Aa] juncture? [Aa] !badstress juncture? !V1) AE <- ([Aa] juncture? [Ee] !badstress juncture? !V1) AI <- ([Aa] [Ii] !badstress juncture? !(V1)) AO <- ([Aa] [Oo] !badstress juncture? !(V1)) AIb <- ([Aa] [Ii] !badstress juncture? &(V2 juncture? !V1)) AOb <- ([Aa] [Oo] !badstress juncture? &(V2 juncture? !V1)) AU <- ([Aa] juncture? [Uu] !badstress juncture? !V1) EA <- ([Ee] juncture? [Aa] !badstress juncture? !V1) EE <- ([Ee] juncture? [Ee] !badstress juncture? !V1) EI <- ([Ee] [Ii] !badstress juncture? !(V1)) EIb <- ([Ee] [Ii] !badstress juncture? &(V2 juncture? !V1)) EO <- ([Ee] juncture? [Oo] !badstress juncture? !V1) EU <- ([Ee] juncture? [Uu] !badstress juncture? !V1) IA <- ([Ii] juncture? [Aa] !badstress juncture? !(V1)) IE <- ([Ii] juncture? [Ee] !badstress juncture? !(V1)) II <- ([Ii] juncture? 
[Ii] !badstress juncture? !(V1)) IO <- ([Ii] juncture? [Oo] !badstress juncture? !(V1)) IU <- ([Ii] juncture? [Uu] !badstress juncture? !(V1)) IAb <- ([Ii] juncture? [Aa] !badstress juncture? &(V2 juncture? !V1)) IEb <- ([Ii] juncture? [Ee] !badstress juncture? &(V2 juncture? !V1)) IIb <- ([Ii] juncture? [Ii] !badstress juncture? &(V2 juncture? !V1)) IOb <- ([Ii] juncture? [Oo] !badstress juncture? &(V2 juncture? !V1)) IUb <- ([Ii] juncture? [Uu] !badstress juncture? &(V2 juncture? !V1)) OA <- ([Oo] juncture? [Aa] !badstress juncture? !V1) OE <- ([Oo] juncture? [Ee] !badstress juncture? !V1) OI <- ([Oo] [Ii] !badstress juncture? !(V1)) OIb <- ([Oo] [Ii] !badstress juncture? &(V2 juncture? !V1)) OO <- ([Oo] juncture? [Oo] !badstress juncture? !V1) OU <- ([Oo] juncture? [Uu] !badstress juncture? !V1) UA <- ([Uu] juncture? [Aa] !badstress juncture? !(V1)) UE <- ([Uu] juncture? [Ee] !badstress juncture? !(V1)) UI <- ([Uu] juncture? [Ii] !badstress juncture? !(V1)) UO <- ([Uu] juncture? [Oo] !badstress juncture? !(V1)) UU <- ([Uu] juncture? [Uu] !badstress juncture? !(V1)) UAb <- ([Uu] juncture? [Aa] !badstress juncture? &(V2 juncture? !V1)) UEb <- ([Uu] juncture? [Ee] !badstress juncture? &(V2 juncture? !V1)) UIb <- ([Uu] juncture? [Ii] !badstress juncture? &(V2 juncture? !V1)) UOb <- ([Uu] juncture? [Oo] !badstress juncture? &(V2 juncture? !V1)) UUb <- ([Uu] juncture? [Uu] !badstress juncture? &(V2 juncture? !V1)) # adding the new IY and UY, which might see use some time. # they are mandatory monosyllables but do not take a possible additional # following vowel as the regular ones do. So far only used in . IY <- [Ii] [Yy] !badstress juncture? !V1 UY <- [Uu] [Yy] !badstress juncture? !V1 # this is a pause not required by the phonetics. This is the only # sort of pause which could in principle carry semantic freight (the # pause/GU equivalence beloved of our Founder) but we have abandoned # this. 
There is one place, after initial in an utterance, where # a pause can have effect on the parse (but not on the meaning, I believe, # unless a word break is involved). # this class should NEVER be used in a context which might follow # a name word. In previous versions, pauses after name words were included # in the name word; this is not the case here, so a PAUSE # after a name word would not be recognized as a mandatory pause. # in any event, as long as we stay away from pause/GU equivalence, this # is not a serious issue! # this class does do some work in the handling of issues surrounding the legacy # shape of APA connectives, concerning which the less said, the better. PAUSE <- [,] [ ]+ !(V1/connective) &caprule # more punctuation comma <- [,] [ ]+ &caprule comma2 <- [,]? [ ]+ &caprule # Part II Lexicography # In this section I develop the grammar of words in Loglan. I'll work by editing the original provisional PEG grammar. # I place the start of this section exactly here, just before two final items of # punctuation, because these items of punctuation look forward not only to lexicography # but to the full grammar! # the end of utterance symbol <#> should be added in the phonetics # section as a species of terminal marker. Done. We do *not* actually # endorse use of this marker, but we can notionally support it and it is in # our sources. end <- (([ ]* '#' [ ]+ utterance)/([ ]+ !.)/!.) # this rule allows terminal punctuation to be followed by an inverse vocative, # a frequent occurrence in Leith's novel, and something which makes sense. period <- (([!.:;?] (&end/([ ]+ &caprule))) (invvoc period?)?) # Letters with y will be special cases # idea: allow IY and UY (always monosyllables) as vowel combinations in cmapua only. # done: Y has a name now. is also added. # the classes in this section after this point are the cmapua word classes of Loglan (if they begin with [ ]* or a word class). 
# I suppose the alien text classes are not really word classes, but they are lexicographic items, as it were. # Paradoxically, the PA and NI classes admit internal explicit pauses. So of course do predicate words! # Loglan does admit true multisyllable cmapua: there are words made of cmapua units which have joints between # units at which one cannot pause without breaking the word. Lojban, I am told, does not. # this version has the general feature that the quotation and alien text constructions are not hacked: # they are supported by the phonetic rules (as dire exceptions, of course) and the grammatical constructions # conform with the phonetic layer. Alien text and utterances quoted with
# LI ... LU can be enclosed in double quotes.
# LI only supports full utterances, for the moment. All alien text constructors take the same class as argument:
# the vocative and inverse vocative *require* quotes to avoid misreading ungrammatical expressions with typos
# as correct (inverse) vocatives.
# the names , for Y are supported. The Ceo names are left as they are. I decided that a second short series
# of letteral pronouns is actually a reasonable use of short words, and the Ceio words are there for other uses.

TAI0 <- (V1 juncture? M a/V1 juncture? F i/V1 juncture? Z i/!predstart C1 AI/!predstart C1 EI/!predstart C1 AIb u/!predstart C1 EIb (u)/!predstart C1 EO/ Z [Ii] V1 !badstress juncture? !V1 (M a)?)

# a negative suffix used in various contexts. Always a suffix: its use as a prefix in tenses was a mistake in NB3 and I
# think it is still supported in LIP. Ambiguities demonstrably followed from this usage (an example of how the demonstration
# of non-ambiguity of 1989 Loglan was compromised by the opaque lexicography).

NOI <- (N OI)

# the logical connectives. [A0] is the class of core logical connectives. [A] is the fully decorated logical connective with
# possible nu- (always in nuno- or nuu) and no- prefixes, possible -noi suffix, and possible (problematic) PA suffix, closed
# with -fi (our new proposal) or an explicit pause.

A0 <- &Cmapua (a/e/o/u/H a/N UU)
A <- [ ]* !predstart !TAI0 (N [o])? A0 NOI? !([ ]+ PANOPAUSES PAUSE) !(PANOPAUSES !PAUSE [ ,]) (PANOPAUSES ((F i)/&PAUSE))?

# A not closed with -fi or a pause

ANOFI <- [ ]* (!predstart !TAI0 ( (N [o])? A0 NOI? PANOPAUSES?))
A1 <- A

# versions of A with different binding strength

ACI <- (ANOFI C i)
AGE <- (ANOFI G e)

# a tightly binding series of logical connectives used to link predicates
# this also includes the fusion connective when used between predicates.

CA0 <- (( (N o)? ((C a)/(C e)/(C o)/(C u)/(Z e)/(C i H a)/N u C u)) NOI?)
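# A minimal sketch in Python (not part of the grammar) of the word shape that CA0 above describes:
# an optional no- prefix, one core, and an optional -noi suffix. Assumptions: the !predstart and
# !badstress checks and the optional junctures built into the letter classes are ignored, so this
# is looser than the real rule. The helper name is hypothetical.

```python
import re

# Simplified stand-in for CA0: (no)? core (noi)?, matched case-insensitively.
# The cores are the seven alternatives listed in the rule: ca, ce, co, cu, ze, ciha, nucu.
# Lookahead checks and junctures from the real grammar are deliberately left out.
CA0_SHAPE = re.compile(r"(?:no)?(?:ca|ce|co|cu|ze|ciha|nucu)(?:noi)?", re.IGNORECASE)


def is_ca0(word: str) -> bool:
    """Return True if the whole of word has the simplified CA0 shape."""
    return CA0_SHAPE.fullmatch(word) is not None


if __name__ == "__main__":
    for sample in ["ca", "noca", "canoi", "nucu", "ka", "noi"]:
        print(sample, is_ca0(sample))
```

# The sketch only checks spelling. Whether a given string is actually read as a CA0 word also depends
# on the surrounding phonetic tests, which only the PEG rules themselves express.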
CA1 <- (CA0 !([ ]+ PANOPAUSES PAUSE) !(PANOPAUSES !PAUSE [ ,]) (PANOPAUSES ((F i)/&PAUSE))?)
CA1NOFI <- (CA0 PANOPAUSES?)
CA <- ([ ]* CA1)

# the fusion connective when used in arguments

ZE2 <- ([ ]* (Z e))

# sentence connectives. [I] is the class of utterance initiators (no logical definition).
# the subsequent classes are inhabited by sentence logical connectives with various binding
# strengths.

I <- ([ ]* !predstart !TAI0 i !([ ]+ PANOPAUSES PAUSE) !(PANOPAUSES !PAUSE [ ,]) (PANOPAUSES ((F i)/&PAUSE))?)
ICA <- ([ ]* i ((H a)/CA1))
ICI <- ([ ]* i CA1NOFI? C i)
IGE <- ([ ]* i CA1NOFI? G e)

# forethought logical connectives

KA0 <- ((K a)/(K e)/(K o)/(K u)/(K i H a)/(N u K u))

# causal and comparative modifiers

KOU <- ((K OU)/(M OI)/(R AU)/(S OA)/(M OU)/(C IU))

# negative and converse forms

KOU1 <- (((N u N o)/(N u)/(N o)) KOU)

# the full type of forethought connectives, adding the causal and comparative connectives

KA <- ([ ]* ((KA0)/((KOU1/KOU) K i)) NOI?)

# the last component of the KA...KI... structure of forethought connections

KI <- ([ ]* (K i) NOI?)

# causal and comparative modifiers which are *not* forethought connectives

KOU2 <- (KOU1 !KI)

# a test used to at least partially enforce the penultimate stress rule on quantifier predicates

BadNIStress <- ((C1 V2 V2? stress (M a)? (M OA)? NI RA)/(C1 V2 stress V2 (M a)? (M OA)? NI RA))

# root quantity words, including the numerals

NI0 <- (!BadNIStress ((K UA)/(G IE)/(G IU)/(H IE)/(H IU)/(K UE)/(N EA)/(N IO)/(P EA)/(P IO)/(S UU)/(S UA)/(T IA)/(Z OA)/(Z OO)/(H o)/(N i)/(N e)/(T o)/(T e)/(F o)/(F e)/(V o)/(V e)/(P i)/(R e)/(R u)/(S e)/(S o)/(H i)))

# the class of SA roots, which modify quantifiers

SA <- (!BadNIStress ((S a)/(S i)/(S u)/(IE (comma2? !IE SA)?)) NOI?)

# the family of quantifiers which double as suffixes for the quantifier predicates
# this class perhaps should also include some other quantifier words. for example ought to be handled in the same way as .
# No action here, just a remark.
RA <- (!BadNIStress ((R a)/(R i)/(R o))) # quantifier units consisting of a NI or RA root with 00 or 000 appended; to one can further # append a digit to iterate : is four billion, for example. , a few thousand. # a NI1 or RA1 may be followed by a pause before another NI word other than a numerical predicate; # one is allowed to breathe in the middle of long numerals. I question whether the pause # provision makes sense in RA1. NI1 <- ((NI0 (!BadNIStress M a)? (!BadNIStress M OA NI0*)?) (comma2 !(NI RA) &NI)?) RA1 <- ((RA (!BadNIStress M a)? (!BadNIStress M OA NI0*)?) (comma2 !(NI RA) &NI)?) # a composite NI word, optional SA prefix before a sequence of NI words or a RA word, # or a single SA word [which will modify a default quantifier not expressed], # possibly negated, connected with CA0 roots to other such constructs. NI2 <- (( (SA? (NI1+/RA1))/SA) NOI? (CA0 ((SA? (NI1+/RA1))/SA) NOI?)*) # a full NI word with an acronymic dimension (starting with , ending with a pause) or appended. I need to look up # and figure out its semantics. An arbitrary name word may now be used as a dimension, as well. NI <- ([ ]* NI2 (&(M UE) Acronym (comma/&end/&period) !(C u)/comma2? M UE comma2? PreName !(C u))? (C u)?) # mex is now identical with NI, but it's in use in later rules. mex <- ([ ]* NI) # a word used for various tightly binding constructions: a sort of verbal hyphen. # also a name marker, which means phonetic care is needed (pause after constructions with ). CI <- ([ ]* (C i)) # Acronyms, which are names (not predicates as in 1989 Loglan) or dimensions (in NI above). # units in acronym are TAI0 letterals, zV short forms for vowels, the dummy unit , and NI1 # quantity units. NI1 quantity units may not be initial. units may be preceded by pauses. # An acronym has at least two units. # it is worth noting that acronyms, once viewed as names, could be entirely suppressed as a feature of the # grammar by really making them names (terminate them with -n). 
I suppose a similar approach would work # for dimensions, allowing any name word to serve as a dimension. would be a name marker for use # with dimensions in this case. , three dollars. Now supported. Acronym <- ([ ]* &caprule ((M UE)/TAI0/(Z V2 !V2)) ((comma &Acronym M UE)/NI1/TAI0/(Z V2 (!V2/(Z &V2))))+) # the full class of letterals, including the construction whose details I should look at. TAI <- ([ ]* (TAI0/((G AO) !V2 [ ]* (PreName/Predicate/CmapuaUnit)))) # atomic non-letteral pronouns. DA0 <- ((T AO)/(T IO)/(T UA)/(M IO)/(M IU)/(M UO)/(M UU)/(T OA)/(T OI)/(T OO)/(T OU)/(T UO)/(T UU)/(S UO)/(H u)/(B a)/(B e)/(B o)/(B u)/(D a)/(D e)/(D i)/(D o)/(D u)/(M i)/(T u)/(M u)/(T i)/(T a)/(M o)) # letterals (not including constructions and atomic pronouns optionally suffixed with a digit. One should pause after the # suffixed forms, because is a name marker. DA1 <- ((TAI0/DA0) (C i ![ ] NI0)?) # general pronoun words. DA <- ([ ]* DA1) # roots for PA words: tense and location words, prepositions building relative modifiers. All can optionally be negated with -noi. They may also be quantified. They may also be closed with ZI class affixes. PA cores. PA0 <- (NI2? (N u !KOU)? ((G IA)/(G UA)/(P AU)/(P IA)/(P UA)/(N IA)/(N UA)/(B IU)/(F EA)/(F IA)/(F UA)/(V IA)/(V II)/(V IU)/(C OI)/(D AU)/(D II)/(D UO)/(F OI)/(F UI)/(G AU)/(H EA)/(K AU)/(K II)/(K UI)/(L IA)/(L UI)/(M IA)/(N UI)/(P EU)/(R OI)/(R UI)/(S EA)/(S IO)/(T IE)/(V a)/(V i)/(V u)/(P a)/(N a)/(F a)/(V a)/(KOU !(N OI) !KI)) (N OI)? ZI?) # the form used for actual prepositions and suffixes to A words, with minimal pauses allowed. # these are built by concatenating KOU2 and PA0 units, then linking these with CA0 roots (which can take # no- prefixes and -noi suffixes, and next to which one *can* pause), optionally suffixed with a class ZI suffix. PANOPAUSES <- ((KOU2/PA0)+ ((comma2? CA0 comma2?) 
(KOU2/PA0)+)*) # prepositional words PA3 <- ([ ]* PANOPAUSES) # class PA can appear as tense markers or as relative modifiers without arguments; here pauses # are allowed not only next to CA0 units but between KOU2/PA units. Like NI words, PA # words are a class of arbitrary length constructions, and we think breaths within them # (especially complex ones) are natural. PA <- ((KOU2/PA0)+ (((comma2? CA0 comma2?)/(comma2 !mod1a)) (KOU2/PA0)+)*) !modifier PA2 <- ([ ]* PA) GA <- ([ ]* (G a)) # the class of tense markers which can appear before predicates. PA1 <- ((PA2/GA)) # suffixes which indicate extent or remoteness/proximity of the action of prepositions. ZI <- ((Z i)/(Z a)/(Z u)) # the primitive description building "articles". These include which requires special # care in its use because it is a name marker. LE <- ([ ]* ((L EA)/(L EU)/(L OE)/(L EE)/(L AA)/(L e)/(L o)/(L a))) # articles which can be used with abstract descriptions: these include some quantity words. # this means that some abstract descriptions are semantically indefinites: I wonder if this # could be improved by having a separate abstract indefinite construction. LEFORPO <- ([ ]* ((L e)/(L o)/NI2)) # the numerical/quantity article. LIO <- ([ ]* (L IO)) # structure words for the ordered and unordered list constructions. LAU <- ([ ]* (L AU)) LOU <- ([ ]* (L OU)) LUA <- ([ ]* (L UA)) LUO <- ([ ]* (L UO)) ZEIA <- ([ ]* Z EIb a) ZEIO <- ([ ]* Z EIb o) # initial and final words for quoting Loglan utterances. LI1 <- (L i) LU1 <- (L u) # quoting Loglan utterances, with or without explicit double quotes (if they appear, they must # appear on both sides). The previous version allowed quotation of names; likely this should # be restored. LI <- ([ ]* LI1 comma2? utterance0 comma2? LU1/[ ]* LI1 comma2? [\"] utterance0 [\"] comma2? LU1) # the foreign name construction. This is an alien text construction LAO <- ([ ]* &([Ll] [Aa] [Oo]juncture?) AlienWord) # the strong quotation construction. 
This is an alien text construction. LIE <- ([ ]* &([Ll] [Ii] juncture? [Ee]juncture?) AlienWord) # I am not sure this class is used at all. LW <- Cmapua # articles for quotation of words LIU0 <- ((L IU)/(N IU)) # this now imposes the condition that an explicit comma pause (or terminal punctuation, or end) must appear at the end of the # Word or PreName quoted with . This seems like a good idea, anyway. # this class appeals to the phonetics. Words and PreNames can be quoted. The ability to quote names # here may remove the need to quote them with
# li...lu. Of course, some Words are in fact phrases rather
# than single words: we will see whether the privileges afforded are used. The final clause allows
# use of letterals as actual names of letters.
# added : didn't make it a name marker.
LIU1 <- ([ ]* ([Ll]/[Nn])[iI] juncture? [Uu] juncture? !V1 comma2? (PreName/Word) &(comma/terminal/end) /[ ]*(L II TAI ))
# the construction of foreign and onomatopoeic predicates. These are alien text constructions.
SUE <- ([ ]* &([Ss] [Uu] juncture? [Ee] juncture?/[Ss] [Aa] [Oo] juncture?) AlienWord)
# left marker in a predicate metaphor construction
CUI <- ([ ]* (C UI) )
# other uses of GA
GA2 <- ([ ]* (G a) )
# ge/geu act as "parentheses" to make an atomic predicate from a complex of metaphorically
# and logically connected predicates; has other left marking uses.
GE <- ([ ]* (G e) )
GEU <- ([ ]* ((C UE)/(G EU)) )
# final marker of a list of head terms
GI <- ([ ]* ((G i)/(G OI)) )
# used to move a normally prefixed metaphorical modifier after what it modifies.
GO <- ([ ]* (G o) )
# marker for second and subsequent arguments before the predicate; NEW
GIO <- ([ ]* (G IO) )
# the generic right marker of many constructions.
GU <- ([ ]* (G u) )
# various flavors of right markers.
# It should be noted that at one point I executed a program of simplifying these to
# reduce the likelihood that multiple 's would ever be needed to close an utterance.
# first of all, I made the closures leaner, moving them out of the classes closed
# to their clients so that they generally can be used only when needed.
# Notably, the grammar of is quite different. Second,
# I introduced some new flavors of right marker. All can be realized with ,
# but if one knows the right flavor one can close the right structure with a single
# right closure.
# right markers of subordinate clauses (argument modifiers).
# closes a different class than in the trial.85 grammar, with
# similar but on the whole better results.
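# a sketch of how the suffixed and plain forms just below are kept apart, assuming that
# G, UI, Z and a match the obvious letters and vowel pairs as defined in the phonetics
# section (the sample inputs are illustrative only):
#   guiza  -> GUIZA matches the whole word; GUI fails at once, because its leading
#             !GUIZA lookahead sees that GUIZA would match, so the plain marker cannot
#             split off "gui" and strand "za".
#   gui    -> none of GUIZA/GUIZI/GUIZU match, all three lookaheads pass, GUI matches.
# the same guard pattern protects JI and JIO from their JIZA/JIOZA (etc.) variants
# further down.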
GUIZA <- ([ ]* (G UI) (Z a) ) GUIZI <- ([ ]* (G UI) (Z i) ) GUIZU <- ([ ]* (G UI) (Z u) ) GUI <- (!GUIZA !GUIZI !GUIZU ([ ]* (G UI) )) # right markers of abstract predicates and descriptions. # probably the forms with z are to be preferred (and the other # two are not needed) but I preserve all five classes for now. GUO <- ([ ]* (G UO) ) GUOA <- ([ ]* (G UOb a/G UO Z a) ) GUOE <- ([ ]* (G UOb e) ) GUOI <- ([ ]* (G UOb i/G UO Z i) ) GUOO <- ([ ]* (G UOb o) ) GUOU <- ([ ]* (G UOb u/G UO Z u) ) # right marker used to close term (argument/predicate modifier) lists. # it is important to note that in our grammar GUU is not a component of # the class termset, nor is it a null termset: it appears in other classes # which include termsets as an option to close them. The effects are similar # to those in the trial.85 grammar, but there is less of a danger that # extra unexpected closures will be needed. GUU <- ([ ]* (G UU) ) # a new closure for arguments in various contexts GUUA <- ([ ]* (G UUb a) ) # a new closure for sentences. In particular, it # may have real use in closing up the scope of a list of # fronted terms before a series of logically connected sentences. GIUO <- ([ ]* (G IUb o) ) # right marker used to close arguments tightly linked with JE/JUE. GUE <- ([ ]* (G UE) ) # a new closure for descpreds GUEA <- ([ ]* (G UEb a) ) # used to build tightly linked term lists. JE <- ([ ]* (J e) ) JUE <- ([ ]* (J UE) ) # used to build subordinate clauses (argument modifiers). 
JIZA <- ([ ]* ((J IE)/(J AE)/(P e)/(J i)/(J a)/(N u J i)) (Z a) ) JIOZA <- ([ ]* ((J IO)/(J AO)) (Z a) ) JIZI <- ([ ]* ((J IE)/(J AE)/(P e)/(J i)/(J a)/(N u J i)) (Z i) ) JIOZI <- ([ ]* ((J IO)/(J AO)) (Z i) ) JIZU <- ([ ]* ((J IE)/(J AE)/(P e)/(J i)/(J a)/(N u J i)) (Z u) ) JIOZU <- ([ ]* ((J IO)/(J AO)) (Z u) ) JI <- (!JIZA !JIZI !JIZU ([ ]* ((J IE)/(J AE)/(P e)/(J i)/(J a)/(N u J i)) )) JIO <- (!JIOZA !JIOZI !JIOZU ([ ]* ((J IO)/(J AO)) )) # case tags, both numerical position tags and the optional semantic case tags. DIO <- ([ ]* ((B EU)/(C AU)/(D IO)/(F OA)/(K AO)/(J UI)/(N EU)/(P OU)/(G OA)/(S AU)/(V EU)/(Z UA)/(Z UE)/(Z UI)/(Z UO)/(Z UU)) ) # markers of indirect reference. Originally these had the same grammar as case tags, # but they are now different. LAE <- ([ ]* ((L AE)/(L UE)) ) # turns arguments into predicates, closes this construction. ME <- ([ ]* ((M EA)/(M e)) ) MEU <- ([ ]* M EU ) # reflexive and conversion operators: first the root forms, then those with # optional numerical suffixes. NU0 <- ((N UO)/(F UO)/(J UO)/(N u)/(F u)/(J u)) NU <- ([ ]* (NU0 !([ ]+ (NI0/RA)) (NI0/RA)? freemod?)+ ) # abstract predicate constructors (from sentences) # I do *not* think # that will really be confused with , particularly # since we do require an explicit pause before in the latter case, # but I record this concern: the forms with z might be preferable. 
PO1 <- ([ ]* ((P o)/(P u)/(Z o)))
PO1A <- ([ ]* ((P OIb a)/(P UIb a)/(Z OIb a)/(P o Z a)/(P u Z a)/(Z o Z a)))
PO1E <- ([ ]* ((P OIb e)/(P UIb e)/(Z OIb e)))
PO1I <- ([ ]* ((P OIb i)/(P UIb i)/(Z OIb i)/(P o Z i)/(P u Z i)/(Z o Z i)))
PO1O <- ([ ]* ((P OIb o)/(P UIb o)/(Z OIb o)))
PO1U <- ([ ]* ((P OIb u)/(P UIb u)/(Z OIb u)/(P o Z u)/(P u Z u)/(Z o Z u)))
# abstract predicate constructor from simple predicates
POSHORT1 <- ([ ]* ((P OI)/(P UI)/(Z OI)))
# word forms associated with the above abstract predicate root forms
PO <- ([ ]* PO1 )
POA <- ([ ]* PO1A )
POE <- ([ ]* PO1E )
POI <- ([ ]* PO1I )
POO <- ([ ]* PO1O )
POU <- ([ ]* PO1U )
POSHORT <- ([ ]* POSHORT1 )
# register markers
DIE <- ([ ]* ((D IE)/(F IE)/(K AE)/(N UE)/(R IE)) )
# vocative forms: I still have the words of social lubrication as
# vocative markers.
HOI <- ([ ]* ((H OI)/(L OI)/(L OA)/(S IA)/(S IE)/(S IU)) )
# the verbal scare quote. The numerical suffix indicates how many preceding words are affected;
# this is an odd mechanism.
JO <- ([ ]* (NI0/RA)? (J o) )
# markers for forming parenthetical utterances as free modifiers.
KIE <- ([ ]* (K IE) )
KIU <- ([ ]* (K IU) )
KIE2 <- [ ]* K IE comma2? [(]
KIU2 <- [ ]* [)] comma2? K IU
# marker for forming smilies.
SOI <- ([ ]* (S OI) )
# a grab bag of attitudinal words, including but not restricted to the VV forms.
UI0 <- (!predstart (!([Ii] juncture? [Ee]) VV juncture?/(B EA)/(B UO)/(C EA)/(C IA)/(C OA)/(D OU)/(F AE)/(F AO)/(F EU)/(G EA)/(K UO)/(K UU)/(R EA)/(N AO)/(N IE)/(P AE)/(P IU)/(S AA)/(S UI)/(T AA)/(T OE)/(V OI)/(Z OU)/((L OI))/((L OA))/((S IA))/(S II)/(T OE)/((S IU))/(C AO)/(C EU)/((S IE))/(S EU)))
# negative forms of the attitudinals. The ones with before the two vowel forms are a phonetic exception. The others
# should also be (though they present no pronunciation problem) so that they are resolved as single words.
NOUI <- (([ ]* N [o] juncture? [ ]* UI0 )/([ ]* UI0 NOI ))
# all attitudinals (adding the discursives nefi, tofi...
etc) # there is a technical problem with mixing UI0 roots of VV and CVV shapes. UI1 <- ([ ]* (UI0+/(NI F i)) ) # the inverse vocative marker HUE <- ([ ]* (H UE)) # occurrences of as a word rather than an affix. NO1 <- ([ ]* !KOU1 !NOUI (N o) !([ ]* KOU) !([ ]* (JIO/JI/JIZA/JIOZA/JIZI/JIOZI/JIZU/JIOZU)) ) # Names, acronyms and PreNames from above. AcronymicName <- Acronym &(comma/period/end) DJAN <- (PreName/AcronymicName) # predicate words which are phonetically cmapua # "identity predicates". Converses are provided as a new proposal. BI <- ([ ]* (N u)? ((B IA)/(B IE)/(C IE)/(C IO)/(B IA)/(B [i])) ) # interrogative and pronoun predicates LWPREDA <- ((H e)/(D UA)/(D UI)/(B UA)/(B UI)) # here I should reinstall the proposal. # the predicate words defined above in the phonetics section Predicate <- Complex # predicate words, other than the "identity predicates" of class [BI] # these include the numerical predicates (NI RA), also cmapua phonetically. PREDA <- ([ ]* &caprule (Predicate/LWPREDA/(![ ] NI RA)) ) # Part 3: The Grammar Proper # right markers turned into classes. guoa <- (PAUSE? (GUOA/GU) freemod?) guoe <- (PAUSE? (GUOE/GU) freemod?) guoi <- (PAUSE? (GUOI/GU) freemod?) guoo <- (PAUSE? (GUOO/GU) freemod?) guou <- (PAUSE? (GUOU/GU) freemod?) guo <- (!guoa !guoe !guoi !guoo !guou (PAUSE? (GUO/GU) freemod?)) guiza <- (PAUSE? (GUIZA/GU) freemod?) guizi <- (PAUSE? (GUIZI/GU) freemod?) guizu <- (PAUSE? (GUIZU/GU) freemod?) gui <- (PAUSE? (GUI/GU) freemod?) gue <- (PAUSE? (GUE/GU) freemod?) guea <- (PAUSE? (GUEA/GU) freemod?) guu <- (PAUSE? (GUU/GU) freemod?) guua <- (PAUSE? (GUUA/GU) freemod?) giuo <- (PAUSE? (GIUO/GU) freemod?) meu <- (PAUSE? (MEU/GU) freemod?) geu <- GEU # Here note the absence of pause/GU equivalence. gap <- (PAUSE? GU freemod?) # this is the vocative construction. It can appear early because all of its components are marked. # the intention is to indicate who is being addressed. 
# This can be handled via a name, a descriptive argument, a predicate or an
# alien text name (the last must be quoted). The complexities of these grammatical constructions can be deferred until they are
# introduced.
# HOI0 <- [ ]* [Hh] [Oo] [Ii] juncture?
# restore words of social lubrication as vocative markers but not as name markers:
# I do not allow a freemod to intervene between a vocative marker and the associated
# utterance, to avoid unintended grabbing of subjects by the words of social lubrication when they are used
# as vocative markers. This lets and be equivalent. The comma is needed in the
# first because the social lubrication words are in this version not name markers.
HOI0 <- ([ ]* ((([Hh] OI)/([Ll] OI)/([Ll] OA)/([Ss] IA)/([Ss] IE)/([Ss] IU)))) juncture? !V1
voc <- (HOI0 comma2? name /(HOI comma2? descpred guea? (comma2 &(!FalseMarked PreName/[Cc][Ii] juncture? comma2 (PreName/AcronymicName)) name)?)/(HOI comma2? argument1 guua?)/[ ]* &([Hh] [Oo] [Ii] juncture?) AlienWord)
# this is the inverse vocative. It can appear early because all of its components are marked.
# the intention is to indicate who is speaking. The range of ways this can be handled is similar to the range of ways it can be
# handled for the vocative; there is the further option of a sentence (the [statement] class) and there is a strong closure option
# for the case where an argument is used (to avoid it inadvertently expanding to a sentence).
HUE0 <- [ ]* &caprule [Hh] [Uu] juncture? [Ee] juncture? !V1
invvoc <- (HUE0 comma2? name/HUE freemod? descpred guea? (comma2 &(!FalseMarked PreName/[Cc][Ii] juncture? comma2 (PreName/AcronymicName)) name)?/(HUE freemod? statement giuo?)/(HUE freemod? argument1 guu?)/[ ]* &([Hh] [Uu] juncture? [Ee] juncture?) AlienWord)
# this is the class of free modifiers. Most of its components are head marked (those that aren't appear just above),
# and it is useful for it to appear early because these things appear everywhere in subsequent constructions.
A free modifier, # of whatever sort, is a freely insertable gadget which modifies the immediately preceding construction, or the entire utterance # if it is initial. # NOUI is a negated attitudinal word. UI1 is an attitudinal word: these express an emotional attitude toward the # assertion (noting that EI marks questions (yes or no answer expected) and SEU marks utterances as answers). # SOI creates smilies in a general sense: indicates that the listener should imagine the speaker smiling; # similarly for other predicates. # DIE and NO DIE are register markers, communicating the social attitude of the speaker toward the one addressed: for # example is "dear" # KIE...KIU constructs a full parenthetical utterance as a comment, which can be enclosed in actual parentheses inside # the marker words. # JO is a scare quote device. # the comma is a freemod with no semantic content: this is a device for discarding phonetically required pauses # and the speaker's optional pauses alike. The pause before a non-pause marked prename is part of the NameWord and so # is excluded. Ellipses and dashes are fancy pauses supported as freemods. freemod <- ((NOUI/(SOI freemod? descpred guea?)/DIE/(NO1 DIE)/(KIE comma? utterance0 comma? KIU)/(KIE2 comma? utterance0 comma? KIU2)/invvoc/voc/(comma !(!FalseMarked PreName))/JO/UI1/([ ]* '...' ([ ]* &letter)?)/([ ]* '--' ([ ]* &letter)?)) freemod?) # the classes juelink to linkargs describe very tightly bound arguments which can be firmly attached to predicates in # the context of metaphorical modifications and the use of predicates in descriptive arguments. # note that we allow predicate modifiers (prepositional phrases) to be bound with which is not # allowed in 1989 Loglan, but which we believe is supported in Lojban. juelink <- (JUE freemod? (term/(PA2 freemod? gap?))) links1 <- (juelink (freemod? juelink)* gue?) links <- ((links1/(KA freemod? links freemod? KI freemod? links1)) (freemod? A1 freemod? links1)*) jelink <- (JE freemod? 
(term/(PA2 freemod? gap?))) linkargs1 <- (jelink freemod? (links/gue)?) linkargs <- ((linkargs1/(KA freemod? linkargs freemod? KI freemod? linkargs1)) (freemod? A1 freemod? linkargs1)*) # class abstractpred supports the construction of event, property, and quantity predicates from sentences. These are # closable with if introduced with and closable with suffixed variants of if introduced with suffixed # variants of (a NEW idea but it is clear that closure of these predicates (and of the more commonly # used associated descriptions) is an important issue). abstractpred <- ((POA freemod? uttAx guoa?)/(POA freemod? sentence guoa?)/(POE freemod? uttAx guoe?)/(POE freemod? sentence guoe?)/(POI freemod? uttAx guoi?)/(POI freemod? sentence guoi?)/(POO freemod? uttAx guoo?)/(POO freemod? sentence guoo?)/(POU freemod? uttAx guou?)/(POU freemod? sentence guou?)/(PO freemod? uttAx guo?)/(PO freemod? sentence guo?)) # predunit1 describes the truly atomic forms of predicate. # PREDA is the class of predicate words (the phonetic predicate words along with the special phonetic cmapua which are predicates, listed # above under the PREDA rule. NU PREDA handles permutations and identifications of arguments of PREDAs. # SUE contains the alien text constructions with and , semantically quite different but syntactically handled # in the same way. # ... (the closing optional) can parenthesize a fairly complex predicate phrase and turn it into an atomic form. These # forms can have conversion or reflexive operators (NU) applied. I should look into why the class handled in the conversion case # is different. An important use of this is in metaphor constructions, but it has other potential uses. # abstractpred is the class of abstraction predicates just introduced above. These are treated as atomic in this grammar: it should # be noted that their privileges in the trial.85 grammar are (absurdly) limited. # ... 
(the closing optional, but important to have available) forms predicates from arguments, the predicate being true of the # objects to which the argument refers. : this is one of the men we are talking about. predunit1 <- ((SUE/(NU freemod? GE freemod? despredE (freemod? geu comma?)?)/(NU freemod? PREDA)/(comma? GE freemod? descpred (freemod? geu comma?)?)/abstractpred/(ME freemod? argument1 meu?)/PREDA) freemod?) # binds very tightly to predunit1: a possibly multiply negated predunit1 (or an unadorned predunit1) is a predunit2. predunit2 <- ((NO1 freemod?)* predunit1) # an instance of NO2 is one not absorbed by a predunit. Example: X is a slow (not-fast) runner vs # (X is not a fast runner, and in fact may not run at all). NO2 <- (!predunit2 NO1) # a predunit3 is a predunit2 with tightly attached arguments. predunit3 <- ((predunit2 freemod? linkargs)/predunit2) # a predunit is a predunit3 or a predunit3 converted by the short-scope abstraction operators # to an abstraction predicate. This is the kind of predicate which can appear as # a component in a serial name. predunit <- ((POSHORT freemod?)? predunit3) # a further "atomic" (because tightly packaged) form is a forethought connected pair # of predicates (this being the full predicate class defined at the end of the process) # possibly closed with , possibly multiply negated as well. # the closure with guu eliminated the historic rule against kekked heads of metaphors. kekpredunit <- ((NO1 freemod?)* KA freemod? predicate freemod? KI freemod? predicate guu?) # there follows the construction of metaphorically modified predicates, # along with tightly logically linked predicates. # CI and simple juxtaposition of predicates both represent modification of the second # predicate by the first. We impose no semantic conditions on this modification, # except in the case of modification by predicates logically linked with CA, # which do distribute logically in the expected way both as modifiers and as modified. 
# We do not regard as necessarily implying preda2: we do regard # it as having the same place structure as preda2. It is very often but not always # a qualification or kind of preda2; in any case it is a relation analogous to preda2. # modification with CI binds most tightly. # we eliminated the distinction between the series of sentence and description # predicate preliminary classes: there seems to be no need for it even in the # trial.85 grammar. despredA <- ((predunit/kekpredunit) (freemod? CI freemod? (predunit/kekpredunit))*) # this is logical connection of predicates with the tightly binding CA # series of logical connectives. CUI can be used to expand the scope of # a CA connective over a metaphor on the left. ... is used to expand # scope on the right (and could also be used on the left, it should be noted). # descpredC is an internal of despredB assisting the function of CUI. # the !PREDA in front of CUI is probably not needed. despredB <- ((!PREDA CUI freemod? despredC freemod? CA freemod? despredB)/despredA) despredC <- (despredB (freemod? despredB)*) # tight logical linkage of despredB's despredD <- (despredB (freemod? CA freemod? despredB)*) # chain of modifications of despredD's (grouping to the left) despredE <- (despredD (freemod? despredD)*) # the GO construction allows inverse modification: is as it were. # there are profound effects on grouping. descpred <- ((despredE freemod? GO freemod? descpred)/despredE) # this version which appears in sentence predicates as opposed to descriptions differs # in allowing loosely linked arguments (termsets) instead of those linked with for the predicate # moved to the end by GO. sentpred <- ((despredE freemod? GO freemod? barepred)/despredE) # the construction of predicate modifiers (prepositional phrases usable as terms along with arguments). mod1a <- (PA3 freemod? argument1 guua?) # note special treatment of predicate modifiers without actual arguments. 
# the !barepred serves to distinguish these predicate modifiers from actual # "tenses" (predicate markers). mod1 <- ((PA3 freemod? argument1 guua?)/(PA2 freemod? !barepred gap?)) # forethought connection of modifiers. There is some subtlety in # how this is handled. kekmod <- ((NO1 freemod?)* (KA freemod? modifier freemod? KI freemod? mod)) mod <- (mod1/((NO1 freemod?)* mod1)/kekmod) # afterthought connection of modifiers modifier <- (mod (A1 freemod? mod)*) # the serial name is a horrid heterogenous construction! It can involve # components of all three of the major phonetic classes essentially! # However, I believe I have the definition right, with all the components # correctly guarded :-) name <- (PreName/AcronymicName) (comma2? !FalseMarked PreName/comma2? &([Cc] [Ii]) NameWord/comma2? CI predunit !(comma2? (!FalseMarked PreName))/comma2? CI AcronymicName)* freemod? LA0 <- [ ]* [Ll] [Aa] juncture? LANAME <- (LA0 comma2? name) # general constructions of arguments with "articles". # the rules here have the "possessive" construction as in embedded. These are not the same # construction in 1989 Loglan, though speakers might think they are. Here they are indeed the same. The "possessor" cannot # be "indefinite" (cannot start with a quantifier word); the possessor can be followed by a tense, as in # , "John's present house", by analogy with , which is accepted by LIP (because # LIP accepts as a word). # there are other subtleties to be reviewed. descriptn <- (!LANAME ((LAU wordset1)/(LOU wordset2)/(LE freemod? ((!mex arg1a freemod?)? (PA2 freemod?)?)? mex freemod? descpred)/(LE freemod? ((!mex arg1a freemod?)? (PA2 freemod?)?)? mex freemod? arg1a)/(GE freemod? mex freemod? descpred)/(LE freemod? ((!mex arg1a freemod?)? (PA2 freemod?)?)? descpred))) # abstract descriptions. Note that abstract descriptions are closed with entirely independently of abstract predicates: # does not have a grammatical component . 
This avoids the double closure often apparently necessary # in Lojban. abstractn <- ((LEFORPO freemod? POA freemod? uttAx guoa?)/(LEFORPO freemod? POA freemod? sentence guoa?)/(LEFORPO freemod? POE freemod? uttAx guoe?)/(LEFORPO freemod? POE freemod? sentence guoe?)/(LEFORPO freemod? POI freemod? uttAx guoi?)/(LEFORPO freemod? POI freemod? sentence guoi?)/(LEFORPO freemod? POO freemod? uttAx guoo?)/(LEFORPO freemod? POO freemod? sentence guoo?)/(LEFORPO freemod? POU freemod? uttAx guou?)/(LEFORPO freemod? POU freemod? sentence guou?)/(LEFORPO freemod? PO freemod? uttAx guo?)/(LEFORPO freemod? PO freemod? sentence guo?)) # a wider class of basic argument constructions. Notice that LANAME is always read by preference to descriptn. arg1 <- (abstractn/(LIO freemod? descpred guea?)/(LIO freemod? argument1 guua?)/(LIO freemod? mex gap?)/(LIO stringnospaces)/LAO/LANAME/(descriptn guua? (comma2 &(!FalseMarked PreName/[Cc][Ii] juncture? comma2 (PreName/AcronymicName)) name)?)/LIU1/LIE/LI) # this adds pronouns (incl. the fancy letterals) and the option of left marking an argument with arg1a <- ((DA/TAI/arg1/(GE freemod? arg1a)) freemod?) # argument modifiers (subordinate clauses) argmod1 <- ((([ ]* (N o) [ ]*)? ((JI freemod? predicate)/(JIO freemod? sentence)/(JIO freemod? uttAx)/(JI freemod? modifier)/(JI freemod? argument1)))/(([ ]* (N o) [ ]*)? (((JIZA freemod? predicate) guiza?)/((JIOZA freemod? sentence) guiza?)/((JIOZA freemod? uttAx) guiza?)/((JIZA freemod? modifier) guiza?)/(JIZA freemod? argument1 guiza?)))/(([ ]* (N o) [ ]*)? ((JIZI freemod? predicate guizi?)/(JIOZI freemod? sentence guizi?)/(JIOZI freemod? uttAx guizi?)/(JIZI freemod? modifier guizi?)/(JIZI freemod? argument1 guizi?)))/(([ ]* (N o) [ ]*)? ((JIZU freemod? predicate guizu?)/(JIOZU freemod? sentence guizu?)/(JIOZU freemod? uttAx guizu?)/(JIZU freemod? modifier guizu?)/(JIZU freemod? argument1 guizu?)))) # we improved the trial.85 grammar by closing not argmod1 but argmod with . 
But the labelled argument modifier constructors # when building an argmod1 have the argmod1 construction closed with the corresponding labelled right marker, of course. Thus # gui and guiza actually have different grammar. # trial.85 did not provide forethought connected argument modifiers, and we also see no need for them, # though they could readily be added. argmod <- (argmod1 (A1 freemod? argmod1)* gui?) # affix argument modifiers to a definite argument arg2 <- (arg1a freemod? argmod*) # build a possibly indefinite argument from an argument: to le mrenu arg3 <- (arg2/(mex freemod? arg2)) # build an indefinite argument from a predicate indef1 <- (mex freemod? descpred) # affix an argument modifier to an indefinite argument indef2 <- (indef1 guua? argmod*) indefinite <- indef2 # link arguments with the fusion connective arg4 <- ((arg3/indefinite) (ZE2 freemod? (arg3/indefinite))*) # forethought connection of arguments. Note use of argx arg5 <- (arg4/(KA freemod? argument1 freemod? KI freemod? argx)) # arguments with possible negations followed by possible indirect reference constructions. argx <- ((NO1 freemod?)* (LAE freemod?)* arg5) # afterthought connection with the tightly binding ACI connectives arg7 <- (argx freemod? (ACI freemod? argx)?) # afterthought connection with the usual A connectives. Can't start with GE # to avoid an ambiguity (to which 1989 Loglan is vulnerable) involving AGE connectives. arg8 <- (!GE (arg7 freemod? (A1 freemod? arg7)*)) # afterthought connection (now right grouping, instead of the left grouping above) # using the AGE connectives. GUU can be used to affix an argument modifier at this top level. argument1 <- (((arg8 freemod? AGE freemod? argument1)/arg8) (GUU freemod? argmod)*) # possibly negated and case tagged arguments. We (unlike 1989 Loglan) are careful # to use argument only where case tags are appropriate. 
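# a summary sketch of the argument layering built up above, read off from the rules
# themselves (innermost first; each layer adds only what is listed):
#   arg1a      pronouns (DA), letterals (TAI), arg1 forms, optional GE left marking
#   arg2       arg1a plus argument modifiers (argmod*)
#   arg3       an optional mex quantifier in front of an arg2
#   arg4       ZE2 fusion of arg3's and indefinites
#   arg5       forethought (KA ... KI) connection
#   argx       NO1 negations and LAE indirect reference
#   arg7       ACI connection
#   arg8       A1 connection (may not start with GE)
#   argument1  right-grouping AGE connection, GUU-attached argument modifiers
#   argument   NO1 negations and DIO case tags (the rule just below)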
argument <- ((NO1 freemod?)* (DIO freemod?)* argument1) # these classes are exactly argument, but are used to signal # which argument position after the predicate an argument occupies. # I think the grammar is set up so that these will actually # never be case tagged, though the grammar does not expressly forbid it. argumentA <- argument argumentB <- argument argumentC <- argument argumentD <- argument # an argument which is actually case tagged. argxx <- (&((NO1 freemod?)* DIO) argument) # arguments and predicate modifiers actually associated with predicates. term <- (argument/modifier) # a term list consisting entirely of modifiers. modifiers <- (modifier (freemod? modifier)*) # a term list consisting entirely of modifiers and tagged arguments. modifiersx <- ((modifier/argxx) (freemod? (modifier/argxx))*) # a general term list. It cannot contain more than four untagged arguments (they will be labelled # with the lettered subclasses given above). terms <- ((modifiersx? argumentA (freemod? modifiersx)? argumentB? (freemod? modifiersx)? argumentC? (freemod? modifiersx)? argumentD?)/modifiersx) # innards of ordered and unordered list constructions. These are something I totally rebuilt, as they were in a totally # unsatisfactory state in trial.85. Note the use of comma words to separate items in lists. word <- (arg1a/indef2) words1 <- (word (ZEIA word)*) words2 <- (word (ZEIO word)*) wordset1 <- (words1? LUA) wordset2 <- (words2? LUO) # the full term set type to be affixed to predicates. # forethought connection of term lists termset1 <- (terms/(KA freemod? termset2 freemod? guu? KI freemod? termset1)) # afterthought connection of term lists. There are cunning things going on here getting # to work correctly. Note that is NOT a null term list as it was in trial.85. termset2 <- (termset1 (guu &A1)? (A1 freemod? 
termset1 (guu &A1)?)*)
# there is an interesting option here of a list of terms followed by followed by a predicate
# intended to metaphorically modify the predicate to which the terms are affixed. Is there a reason
# why we cannot have a more complex construction in place of terms?
termset <- ((terms freemod? GO freemod? barepred)/termset2)
# this is the untensed predicate with arguments attached. Here is the principal locus
# of closure with , but it is deceptive to say that merely closes barepred,
# as we have seen above, for example in [termset2].
barepred <- (sentpred freemod? ((termset guu?)/(guu &termset))?)
# tensed predicates
markpred <- (PA1 freemod? barepred)
# there follows an area in which my grammar looks different from trial.85. Distinct parallel forms for
# marked and unmarked predicates are demonstrably not needed even in trial.85. The behavior of the ACI
# connectives is plain weird in trial.85; here we treat ACI connectives in the same way as A connectives, but
# binding more tightly.
# units for the ACI construction following -- possibly multiply negated bare or marked predicates.
# the addition of shared termsets to logically connected predicates is handled differently here than in trial.85,
# which uses a very elegant but dreadfully left-grouping rule which a PEG cannot handle. Any realistic situation
# should be manageable.
backpred1 <- ((NO2 freemod?)* (barepred/markpred))
# ACI connected predicates. Shared termsets are added. Notice how we first group backpred1's then recursively
# group backpreds.
backpred <- (((backpred1 (ACI freemod? backpred1)+ freemod? ((termset guu?)/(guu &termset))?) ((ACI freemod? backpred)+ freemod? ((termset guu?)/(guu &termset))?)?)/backpred1)
# A connected predicates; same comments as just above. Cannot start with GE to fix ambiguity with AGE connectives.
predicate2 <- (!GE (((backpred (A1 !GE freemod? backpred)+ freemod? ((termset guu?)/(guu &termset))?) ((A1 freemod? predicate2)+ freemod?
((termset guu?)/(guu &termset))?)?)/backpred)) # predicate2's linked with right grouping AGE connectives (A and ACI are left grouping). predicate1 <- ((predicate2 AGE freemod? predicate1)/predicate2) # identity predicates from above, possibly negated identpred <- ((NO1 freemod?)* (BI freemod? argument1 guu?)) # predicates in general. Note that identity predicates cannot be logically connected # except by using forethought connection (see above). predicate <- (predicate1/identpred) # the subject class is a list of terms (arguments and predicate modifiers) in which all but possibly one # of the arguments are tagged, and there is at least one argument, tagged or otherwise. subject <- ((modifiers freemod?)? ((argxx subject)/(argument (modifiersx freemod?)?))) # The gasent is a basic form of the Loglan sentence in which the predicate leads. # The basic structure is ) followed optionally by terms followed optionally by # followed by terms. The list of terms after (if present) will either contain # at least one argument and no more than one untagged argument # (a subject) [gasent1] or all the arguments of the predicate [gasent2]. We deprecate other arrangements possible in # 1989 Loglan because they would cause unexpected reorientation of the arguments already given before as second # and further arguments were read after . [barepred] is an untensed predicate possibly with arguments; [sentpred] # is "simply a verb", i.e., a predicate without arguments. # there is a semantic change from 1989 Loglan reflected in a grammar change here: # in [gasent1] the final (ga subject) is optional. When it does not appear, the resulting # sentence is an observative (a sentence with subject omitted), not an imperative. # Imperatives for us are unmarked. gasent1 <- ((NO1 freemod?)* (PA1 freemod? barepred (GA2 freemod? subject)?)) gasent2 <- ((NO1 freemod?)* (PA1 freemod? sentpred modifiers? (GA2 freemod? subject freemod? GIO? freemod? 
terms?)))

gasent <- (gasent2/gasent1)

# this is the simple Loglan sentence in various basic orders. The form "gasent" is discussed just above.
# Predicate modifiers can be prefixed to the gasent. The final form given here is the basic SVO sentence.
# The "subject" class is a list of terms (arguments and predicate modifiers) containing at most one
# un-case-tagged argument. The most general SVO form is subject, followed optionally
# by [GIO] followed by a list of terms (1989 Loglan allowed more than one untagged argument before the predicate,
# but this leads to practical problems in which preceding constructions with errors in them may supply extra
# unintended arguments). It should be noted that in NB3 JCB envisioned
# a single argument before the predicate, followed by the predicate, which may itself contain further arguments.
# A gasent may optionally be negated (even multiple times).

statement <- (gasent/(modifiers freemod? gasent)/(subject freemod? (GIO freemod? terms)? predicate))

# this is a forethought connected basic sentence. It is odd (and actual odd results can be exhibited) that the final segment in both
# of these rules is of the very general class uttA1, which includes some quite fragmentary utterances usually intended as answers.
# 12/20/2017 I rewrote the rule in a more compact form. This rule looks ahead to the class [sentence] which we now develop;
# for the moment notice that [sentence] will include [statement].

keksent <- (NO1 freemod?)* (KA freemod? headterms? freemod? sentence freemod? KI freemod? uttA1)

# sentence negation. We allow this to be set off from the main sentence with a mere pause, because generally
# it does not differ in meaning from the result of negating the first argument or predicate modifier.

neghead <- ((NO1 freemod? gap)/(NO2 PAUSE))

# this class includes predicate modifiers preceding a predicate (which may contain arguments), a statement,
# a predicate, and a keksent. Of these, the first and third are imperatives.
sen1 <- ((neghead freemod?)* ((modifiers freemod? !gasent predicate)/statement/predicate/keksent))

# the class [sentence] consists of sen1's afterthought connected with A connectives (class ICA)

sentence <- (sen1 (ICA freemod? sen1)*)

# [headterms] is a list of terms (arguments and predicate modifiers) ending in [GI]. Preceding a [sen1] with these
# causes all predicates in the [sen1] to share these arguments. We propose either that the headterms arguments be directly
# appended to the argument list of each component of the [sen1], or that there is an argument with a numbered case tag at the beginning
# of the headterms list, and the list is inserted at the appropriate position in each component sentence. Neither of these is
# the condition described in Loglan I, which presupposes that we always know what the last argument of each predicate used is.

headterms <- (terms GI)+

# this is a [sentence] prefixed with a list of fronted terms [headterms].
# we think the [giuo] closure might prove useful.

uttAx <- (headterms freemod? sentence giuo?)

# weird answer fragments

uttA <- ((A1/mex) freemod?)

# a broad class of utterances, including various things one would usually only say as answers. Notice
# that this utterance class can take terminal punctuation.

uttA1 <- ((sen1/uttAx/links/linkargs/argmod/(modifiers freemod? keksent)/terms/uttA/NO1) freemod? period?)

# possibly negated utterances of the previous class.

uttC <- ((neghead freemod? uttC)/uttA1)

# utterances linked with more tightly binding ICI sentence connectives. Single sentences are of this class
# if not linked with ICI or ICA.

uttD <- ((sentence period? !ICI !ICA)/(uttC (ICI freemod? uttD)*))

# utterances of the previous class linked with ICA. I went to some trouble to ensure that a freestanding
# [sentence] is actually parsed as a sentence, not a composite uttD, which was a deficiency, if not an ambiguity, of
# LIP and of the trial.85 grammar.

uttE <- (uttD (ICA freemod?
uttD)*)

# utterances of the previous class linked with I sentence connectives.

uttF <- (uttE (I freemod? uttE)*)

# the utterance class for use in the context of parenthetical freemods or quotations, in which it does not go to end of text.

utterance0 <- (!GE ((!PAUSE freemod period? utterance0)/(!PAUSE freemod period?)/(uttF IGE utterance0)/uttF/(I freemod? uttF?)/(I freemod? period?)/(ICA freemod? uttF)) (&I utterance0)?)

# Notice that there are two passes here: the parser first checks that the entire utterance
# is phonetically valid, then returns and checks for grammatical validity.

# the full utterance class. This goes to end of text, and incorporates the phonetics check. This incorporates the only situations
# in which a freemod is initial. The IGE connectives bind even more loosely than the I connectives and right-group instead of
# left grouping.

utterance <- &(phoneticutterance !.) (!GE ((!PAUSE freemod period? utterance)/(!PAUSE freemod period? (&I utterance)? end)/(uttF IGE utterance)/(I freemod? period? (&I utterance)? end)/(uttF (&I utterance)? end)/(I freemod? uttF (&I utterance)? end)/(ICA freemod? uttF (&I utterance)? end)))
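
# A minimal sketch of the two grouping idioms used in this section, given as commented-out
# rules so that the grammar itself is unchanged. The names [unit], [CONN], [chainL] and [chainR]
# are hypothetical placeholders, not classes of this grammar.
#
# repetition, as in [uttE] and [uttF]: a flat chain of units, read here as left grouped
#
#   chainL <- (unit (CONN freemod? unit)*)
#
# right recursion, as in [predicate1] and the IGE alternative of [utterance]: everything after the
# first connective is parsed as a single constituent, so the chain right groups
#
#   chainR <- ((unit CONN freemod? chainR)/unit)
#
# In the second form the recursive alternative has to come first: PEG choice is ordered, so if
# [unit] alone were listed first the rule would succeed on the first unit and the longer chain
# would never be tried.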