The Parsing Expression Grammar (PEG) for Loglan

This is the source for the PEG Loglan parser.

NOTE (5/22): the PEG this is built on (afresh) is that for the "alternative" parser. Differences between the alternative and the official parser are pointed out as necessary. The "alternative" version works better in parsing legacy Loglan text, because it correctly handles fronted modifiers, for which the ancients relied on pause/GU equivalence, without needing any pauses.

PEG notation

A PEG (Parsing Expression Grammar) is made up of lines of the form class_name <- PEG notation. Each PEG notation describes a set of strings with conditions on the context in which they occur.

Concrete strings: 'string' or "string" literally denotes the string given between the quotes (in this example, a 6 character string).

Classes of characters: [aeiou] describes the set of one character strings which are either a, e, i, o, or u. Ranges can appear: [a-zA-Z] describes the union of the sets of lower case letters and upper case letters, considered as one character strings.

If A and B are PEG notations, (A B) denotes a string of class A followed by a string of class B (in which the string of class A is the preferred string of this class read from the beginning of the source string).

If A and B are PEG notations, (A / B) denotes a string of either class A or a string of class B, with a string of class A being read by preference if possible. The fact that a preference is indicated in alternative lists makes PEG reading deterministic (in a sense, there are no ambiguities for a PEG grammar). The problem corresponding to ambiguity in a BNF grammar is incorrectly ordered lists of alternatives.

If A is a PEG notation, (A)? represents a string of class A (preferred) or an empty string if there is no string of class A: this represents optional appearance of A. (A)* represents zero or more consecutive strings of class A (as many as possible) and (A)+ represents one or more consecutive such strings.
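
As an illustration of "as many as possible", here is a minimal Python sketch (not part of the grammar or of the PEG generator; the helper names literal, seq and star are hypothetical). It shows that ('a')* 'a' can never succeed: the repetition consumes every a and a PEG does not back up to give one back, whereas the corresponding regular expression does.

```python
import re

def literal(s):
    """Parser for the concrete string s: returns the end index or None."""
    def parse(text, i):
        return i + len(s) if text.startswith(s, i) else None
    return parse

def seq(*parts):
    """(A B ...): each part must succeed where the previous one stopped."""
    def parse(text, i):
        for part in parts:
            i = part(text, i)
            if i is None:
                return None
        return i
    return parse

def star(p):
    """(A)*: read as many consecutive strings of class A as possible."""
    def parse(text, i):
        while True:
            j = p(text, i)
            if j is None or j == i:  # stop on failure or on an empty match
                return i
            i = j
    return parse

greedy = seq(star(literal("a")), literal("a"))
print(greedy("aaa", 0))                          # None: the star ate every a
print(re.fullmatch(r"a*a", "aaa") is not None)   # True: regexes backtrack
```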

If A is a PEG notation, &(A) represents a length 0 string which is followed by a string of class A, and !(A) represents a length 0 string which is not followed by a string of class A. This gives us powerful lookahead features: for example, (!(A) B) represents a string of class B whose beginning is not also the beginning of a string of class A. It is tempting but not accurate to say that it does not have an initial segment of class A, because detection of a string of class A longer than the string of class B read would also cause reading of this class to fail.

The period . represents the class of single characters (so !. is end of text).
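
The lookahead operators can be sketched in Python as follows. This is an illustration under assumptions, not the implementation used by the PEG generator: the helper names (literal, seq, and_, not_, any_char) are hypothetical. The example takes A = 'abc' and B = 'ab', so that (!(A) B) fails on abc even though the string ab that B would read is shorter than the string of class A detected.

```python
def literal(s):
    """Parser for the concrete string s: returns the end index or None."""
    def parse(text, i):
        return i + len(s) if text.startswith(s, i) else None
    return parse

def seq(*parts):
    """(A B ...): each part must succeed where the previous one stopped."""
    def parse(text, i):
        for part in parts:
            i = part(text, i)
            if i is None:
                return None
        return i
    return parse

def and_(p):
    """&(A): succeed without consuming anything if A can be read here."""
    def parse(text, i):
        return i if p(text, i) is not None else None
    return parse

def not_(p):
    """!(A): succeed without consuming anything if A cannot be read here."""
    def parse(text, i):
        return i if p(text, i) is None else None
    return parse

def any_char(text, i):
    """The period: any single character."""
    return i + 1 if i < len(text) else None

rule = seq(not_(literal("abc")), literal("ab"))   # (!('abc') 'ab')
end_of_text = not_(any_char)                      # !.

print(rule("abd", 0))         # 2: 'ab' is read
print(rule("abc", 0))         # None: an 'abc' starts here
print(end_of_text("ab", 2))   # 2: nothing is left to read
```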

New notations are introduced by lines of the form

class_name <- PEG notation

This is not just an abbreviation facility, because such definitions may be mutually recursive.

A PEG notation applied to a source string will give either failure or a uniquely determined initial string of the source (parsed suitably); in a sense PEG is unambiguous. What corresponds as an issue to ambiguity for a BNF grammar is inappropriate choice of order of alternatives in PEG disjunctions (A / B): what often represents a problem with a grammar is what I call "preemption", where an earlier alternative reads an initial segment of a string where a later alternative could have read more of it.
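
Preemption is easy to reproduce in a toy setting. The following Python sketch (hypothetical helper names literal and choice; not the generator's code) compares ('a' / 'ab') with ('ab' / 'a') on the source string ab: in the first ordering the earlier alternative reads only a, although the later alternative could have read the whole string.

```python
def literal(s):
    """Parser for the concrete string s: returns the end index or None."""
    def parse(text, i):
        return i + len(s) if text.startswith(s, i) else None
    return parse

def choice(*alternatives):
    """(A / B / ...): the first alternative that succeeds is the result."""
    def parse(text, i):
        for alternative in alternatives:
            j = alternative(text, i)
            if j is not None:
                return j
        return None
    return parse

short_first = choice(literal("a"), literal("ab"))   # 'ab' is preempted
long_first = choice(literal("ab"), literal("a"))

print(short_first("ab", 0))   # 1: only 'a' is read
print(long_first("ab", 0))    # 2: all of 'ab' is read
```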

It's possible to have a PEG go into an infinite loop and fail to produce a parse. My PEG generator has a termination checker, so the Loglan grammar does not have these problems. I have contemplated writing a preemption checker, but this is a rather difficult problem.


Letters and symbols

Letters (excluding q,w,x).

lowercase <- (!([qwx]) [a-z])

uppercase <- (!([QWX]) [A-Z])

letter <- (!([QWXqwx]) [A-Za-z])
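
For readers more used to regular expressions, the three classes above can be approximated in Python as below. This is only a sketch: the constant and function names are hypothetical, and the negative lookahead (?!...) plays the role of the PEG !(...).

```python
import re

# Each pattern mirrors one rule: a negative lookahead, then a character class.
LOWERCASE = re.compile(r"(?![qwx])[a-z]")
UPPERCASE = re.compile(r"(?![QWX])[A-Z]")
LETTER = re.compile(r"(?![QWXqwx])[A-Za-z]")

def is_letter(ch):
    """True if ch is a single character of class letter."""
    return LETTER.fullmatch(ch) is not None

print(is_letter("a"), is_letter("q"), is_letter("W"), is_letter("-"))
# True False False False
```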

Syllable breaks and stress markers.

juncture <- (([-] &(letter)) / ([\'*] !(juncture)))

stress <- ([\'*] !(juncture))
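
The recursion in juncture (a stress marker is a juncture only if no juncture follows it) may be easier to follow in a direct transcription. The Python sketch below is illustrative only; the function names mirror the class names, and each function returns the index after the juncture, or None on failure.

```python
def is_letter(ch):
    """Class letter: an ASCII letter other than q, w, x (either case)."""
    return len(ch) == 1 and ch.isascii() and ch.isalpha() and ch not in "QWXqwx"

def juncture(text, i=0):
    """(([-] &(letter)) / ([\\'*] !(juncture)))"""
    ch = text[i:i + 1]
    if ch == "-":
        # A hyphen must be followed by a letter (lookahead only).
        return i + 1 if is_letter(text[i + 1:i + 2]) else None
    if ch in ("'", "*"):
        # A stress marker must not be followed by another juncture.
        return i + 1 if juncture(text, i + 1) is None else None
    return None

def stress(text, i=0):
    """([\\'*] !(juncture)): the second alternative of juncture."""
    ch = text[i:i + 1]
    if ch in ("'", "*") and juncture(text, i + 1) is None:
        return i + 1
    return None

print(juncture("-a"))    # 1
print(juncture("- "))    # None: hyphen not followed by a letter
print(juncture("'-a"))   # None: the stress marker is followed by a juncture
```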

The form of juncture2 enforces the rule that one must pause after a stressed cmapua syllable before a consonant-initial predicate. What happens is that the hyphen or stress marker cannot be followed immediately by a consonant-initial predicate, but it can be followed by an explicit comma pause (in this case parsed as part of the juncture!) which is followed by a consonant-initial predicate. A pause is required before a vowel-initial predicate as well, but it does not have to be explicitly comma-marked (as a pause is obligatory before any vowel-initial word in any case).

juncture2 <- ((([-] &(letter)) / ([\'*] !((([ ])* &(C1) Predicate)) ((', ' ([ ])* &(C1) &(Predicate)))?)) !(juncture))

These are classes of characters which can appear in words (letters and junctures); the first one excludes uppercase letters (except that it allows upper case letters after junctures, a subtle implementation of the capitalization rule).

Lowercase <- (lowercase / (juncture (letter)?))

Letter <- (letter / juncture)

Pauses. The first is the ordinary explicit pause. The second is an internal pause allowed inside certain kinds of "words"; it may be expressed as a space or a comma pause, and the capitalization rule propagates through it.

comma <- ([,] ([ ])+ &(letter))

comma2 <- (([,])? ([ ])+ &(letter) &(caprule))

End of utterance. This includes the possibility of # followed by an entire new utterance.

end <- ((([ ])* '#' ([ ])+ utterance) / (([ ])+ !(.)) / !(.))

Terminal punctuation. This may include an inverse vocative.

period <- (([!.:;?] (&(end) / (([ ])+ &(letter)))) ((invvoc (period)?))?)

Classification and sequencing of sounds

Vowels regular and irregular.

V1 <- [AEIOUYaeiouy]

Regular vowels.

V2 <- [AEIOUaeiou]


Consonants: letters which are not vowels (regular or irregular).

C1 <- (!(V1) letter)

Pairs of vowels which can be monosyllables.

Mono <- (([Aa] [o]) / ((V2 [i]) !([i])) / ([Ii] !([i]) V2) / ([Uu] V2))

Pairs of vowels which must be monosyllables -- note that these are not followed by instances of their final vowel, when the final vowel is i. ao-o and ao-u are allowed, but for example aii breaks down a-ii.

EMono <- (([Aa] [o]) / (([AEOaeo] [i]) !([i])))

This is the rule which chooses the next vowel segment from a stream of vowels.

NextVowels <- (EMono / (V2 &(EMono)) / Mono / V2)
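
The effect of NextVowels on a stream of vowels can be checked with a direct transcription of Mono, EMono and NextVowels into Python. This is a sketch for illustration (the function segment and the helper at are hypothetical, and junctures are not handled); it reproduces the examples given above: aii breaks down as a-ii, and ao-o is allowed.

```python
V2 = set("AEIOUaeiou")
A, I, U, AEO = set("Aa"), set("Ii"), set("Uu"), set("AEOaeo")

def at(text, i):
    """Character at index i, or '' past the end of the text."""
    return text[i] if i < len(text) else ""

def emono(text, i):
    """EMono <- (([Aa] [o]) / (([AEOaeo] [i]) !([i])))"""
    a, b, c = at(text, i), at(text, i + 1), at(text, i + 2)
    if a in A and b == "o":
        return i + 2
    if a in AEO and b == "i" and c != "i":
        return i + 2
    return None

def mono(text, i):
    """Mono <- (([Aa] [o]) / ((V2 [i]) !([i])) / ([Ii] !([i]) V2) / ([Uu] V2))"""
    a, b, c = at(text, i), at(text, i + 1), at(text, i + 2)
    if a in A and b == "o":
        return i + 2
    if a in V2 and b == "i" and c != "i":
        return i + 2
    if a in I and b != "i" and b in V2:
        return i + 2
    if a in U and b in V2:
        return i + 2
    return None

def next_vowels(text, i):
    """NextVowels <- (EMono / (V2 &(EMono)) / Mono / V2)"""
    j = emono(text, i)
    if j is not None:
        return j
    if at(text, i) in V2 and emono(text, i + 1) is not None:
        return i + 1
    j = mono(text, i)
    if j is not None:
        return j
    return i + 1 if at(text, i) in V2 else None

def segment(vowels):
    """Split a stream of regular vowels by repeated use of next_vowels."""
    pieces, i = [], 0
    while i < len(vowels):
        j = next_vowels(vowels, i)
        if j is None:
            raise ValueError("not a regular vowel: " + vowels[i])
        pieces.append(vowels[i:j])
        i = j
    return pieces

print(segment("aii"))   # ['a', 'ii']
print(segment("aoo"))   # ['ao', 'o']
print(segment("eai"))   # ['e', 'ai']
```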

This is an obligatory monosyllable broken by a juncture (hyphen or stress marker).

BrokenMono <- (([a] juncture [o]) / ([aeo] juncture [i]))

This is a CVV single syllable.

CVVSyll <- (C1 Mono)

This is a phonetic component of a cmapua (other than VV components, which do not occur with these): CvvV, CVV, and CV units. Notice that the CVV units may be explicitly disyllables, but may not contain an obligatory monosyllable broken with a juncture.

Cvv-V units did not exist in 1989 Loglan (though they did occur accidentally in acronyms as a side effect of VCV letterals), but they were described as a possibility in the Notebook 3 description of compound cmapua.

LWunit <- (((CVVSyll (juncture)? V2) / (C1 !(BrokenMono) V2 (juncture)? V2) / (C1 V2)) (juncture2)?)

This expresses the Loglan capitalization rule. Only the first of a string of characters satisfying this rule can be capitalized, with certain exceptions: a TAI0 letteral word may be capitalized, as may a vowel after a lower case z: these allow implementation of capitalization conventions for acronyms and also for certain uses of letterals as pronouns attested in existing text. The first letter after a juncture may again be capitalized. The scope of the capitalization rule ends at the first non-letter other than a juncture immediately followed by a letter.

caprule <- ((uppercase / lowercase) ((('z' V1) / lowercase / (juncture (caprule)?) / TAI0))* !(letter))

The permissible initial pairs of consonants.

InitialCC <- ('bl' / 'br' / 'ck' / 'cl' / 'cm' / 'cn' / 'cp' / 'cr' / 'ct' / 'dj' / 'dr' / 'dz' / 'fl' / 'fr' / 'gl' / 'gr' / 'jm' / 'kl' / 'kr' / 'mr' / 'pl' / 'pr' / 'sk' / 'sl' / 'sm' / 'sn' / 'sp' / 'sr' / 'st' / 'tc' / 'tr' / 'ts' / 'vl' / 'vr' / 'zb' / 'zv' / 'zl' / 'sv' / 'Bl' / 'Br' / 'Ck' / 'Cl' / 'Cm' / 'Cn' / 'Cp' / 'Cr' / 'Ct' / 'Dj' / 'Dr' / 'Dz' / 'Fl' / 'Fr' / 'Gl' / 'Gr' / 'Jm' / 'Kl' / 'Kr' / 'Mr' / 'Pl' / 'Pr' / 'Sk' / 'Sl' / 'Sm' / 'Sn' / 'Sp' / 'Sr' / 'St' / 'Tc' / 'Tr' / 'Ts' / 'Vl' / 'Vr' / 'Zb' / 'Zv' / 'Zl' / 'Sv')

A permissible initial pair possibly broken by a juncture. This class is needed to test various conditions in the phonology.

MaybeInitialCC <- (([Bb] (juncture)? [l]) / ([Bb] (juncture)? [r]) / ([Cc] (juncture)? [k]) / ([Cc] (juncture)? [l]) / ([Cc] (juncture)? [m]) / ([Cc] (juncture)? [n]) / ([Cc] (juncture)? [p]) / ([Cc] (juncture)? [r]) / ([Cc] (juncture)? [t]) / ([Dd] (juncture)? [j]) / ([Dd] (juncture)? [r]) / ([Dd] (juncture)? [z]) / ([Ff] (juncture)? [l]) / ([Ff] (juncture)? [r]) / ([Gg] (juncture)? [l]) / ([Gg] (juncture)? [r]) / ([Jj] (juncture)? [m]) / ([Kk] (juncture)? [l]) / ([Kk] (juncture)? [r]) / ([Mm] (juncture)? [r]) / ([Pp] (juncture)? [l]) / ([Pp] (juncture)? [r]) / ([Ss] (juncture)? [k]) / ([Ss] (juncture)? [l]) / ([Ss] (juncture)? [m]) / ([Ss] (juncture)? [n]) / ([Ss] (juncture)? [p]) / ([Ss] (juncture)? [r]) / ([Ss] (juncture)? [t]) / ([Tt] (juncture)? [c]) / ([Tt] (juncture)? [r]) / ([Tt] (juncture)? [s]) / ([Vv] (juncture)? [l]) / ([Vv] (juncture)? [r]) / ([Zz] (juncture)? [b]) / ([Zz] (juncture)? [v]) / ([Zz] (juncture)? [l]) / ([Ss] (juncture)? [v]))
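
The two long lists above differ only in capitalization of the first letter and in the optional juncture. As a compact restatement, here is a Python sketch (the names INITIAL_CC, is_initial_cc and maybe_initial_cc are hypothetical). It assumes that a juncture between the two consonants is a single character among hyphen, apostrophe and asterisk, which is what the juncture class amounts to in this position.

```python
import re

# The 38 permissible initial pairs, in lower case.
INITIAL_CC = frozenset(
    "bl br ck cl cm cn cp cr ct dj dr dz fl fr gl gr jm kl kr mr "
    "pl pr sk sl sm sn sp sr st tc tr ts vl vr zb zv zl sv".split()
)

def is_initial_cc(pair):
    """InitialCC: only the first letter of the pair may be capitalized."""
    return (len(pair) == 2 and pair[1].islower()
            and pair[0].lower() + pair[1] in INITIAL_CC)

def maybe_initial_cc(s):
    """MaybeInitialCC: a permissible initial pair, possibly broken by a juncture."""
    m = re.fullmatch(r"([A-Za-z])[-'*]?([a-z])", s)
    return m is not None and m.group(1).lower() + m.group(2) in INITIAL_CC

print(len(INITIAL_CC))            # 38
print(is_initial_cc("Bl"))        # True
print(is_initial_cc("bL"))        # False
print(maybe_initial_cc("b-l"))    # True
print(maybe_initial_cc("l-b"))    # False
```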

The forbidden medial pairs of consonants.

NonmedialCC <- (([b] (juncture)? [b]) / ([c] (juncture)? [c]) / ([d] (juncture)? [d]) / ([f] (juncture)? [f]) / ([g] (juncture)? [g]) / ([h] (juncture)? [h]) / ([j] (juncture)? [j]) / ([k] (juncture)? [k]) / ([l] (juncture)? [l]) / ([m] (juncture)? [m]) / ([n] (juncture)? [n]) / ([p] (juncture)? [p]) / ([q] (juncture)? [q]) / ([r] (juncture)? [r]) / ([s] (juncture)? [s]) / ([t] (juncture)? [t]) / ([v] (juncture)? [v]) / ([z] (juncture)? [z]) / ([h] (juncture)? C1) / ([cjsz] (juncture)? [cjsz]) / ([f] (juncture)? [v]) / ([k] (juncture)? [g]) / ([p] (juncture)? [b]) / ([t] (juncture)? [d]) / ([fkpt] (juncture)? [jz]) / ([b] (juncture)? [j]) / ([s] (juncture)? [b]))
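
The list of forbidden medial pairs falls into a few families, which the following Python sketch makes explicit. It is an illustration only: the function name is hypothetical, the two consonants are assumed to be given in lower case, and the optional juncture between them is ignored.

```python
CONSONANTS = set("bcdfghjklmnprstvz")   # letters other than vowels, y, q, w, x

def is_nonmedial_cc(a, b):
    """True if the consonant pair a, b is forbidden in medial position."""
    if a not in CONSONANTS or b not in CONSONANTS:
        return False
    if a == b:                                # doubled consonants
        return True
    if a == "h":                              # h before any consonant
        return True
    if a in "cjsz" and b in "cjsz":           # two sibilants
        return True
    if a + b in {"fv", "kg", "pb", "td"}:     # unvoiced + its voiced partner
        return True
    if a in "fkpt" and b in "jz":             # f, k, p, t before j or z
        return True
    return a + b in {"bj", "sb"}              # the two remaining pairs

print(is_nonmedial_cc("s", "b"))   # True
print(is_nonmedial_cc("b", "s"))   # False
print(is_nonmedial_cc("h", "k"))   # True
```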

The forbidden medial triples of consonants.

NonjointCCC <- (([c] (juncture)? [d] (juncture)? [z]) / ([c] (juncture)? [v] (juncture)? [l]) / ([n] (juncture)? [d] (juncture)? [j]) / ([n] (juncture)? [d] (juncture)? [z]) / ([d] (juncture)? [c] (juncture)? [m]) / ([d] (juncture)? [c] (juncture)? [t]) / ([d] (juncture)? [t] (juncture)? [s]) / ([p] (juncture)? [d] (juncture)? [z]) / ([g] (juncture)? [t] (juncture)? [s]) / ([g] (juncture)? [z] (juncture)? [b]) / ([s] (juncture)? [v] (juncture)? [l]) / ([j] (juncture)? [d] (juncture)? [j]) / ([j] (juncture)? [t] (juncture)? [c]) / ([j] (juncture)? [t] (juncture)? [s]) / ([j] (juncture)? [v] (juncture)? [r]) / ([t] (juncture)? [v] (juncture)? [l]) / ([k] (juncture)? [d] (juncture)? [z]) / ([v] (juncture)? [t] (juncture)? [s]) / ([m] (juncture)? [z] (juncture)? [b]))

A sequence of vowels of odd length, needed for certain tests.

Oddvowel <- ((juncture)? (((V2 (juncture)? V2 (juncture)?))* V2) (juncture)?)

Repeated vowels which force a stress on one of the two syllables formed. In the case of i and u, the stress rule is only triggered if an explicit juncture is present, as these are possibly monosyllabic pairs.

RepeatedVowel <- (([Aa] (juncture)? [a]) / ([Ee] (juncture)? [e]) / ([Oo] (juncture)? [o]) / ([Ii] juncture [i]) / ([Uu] juncture [u]))
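
The asymmetry between a, e, o (juncture optional) and i, u (juncture required) can be seen in a regular expression rendering. This is a Python sketch with hypothetical names; it assumes that a juncture in this position is a single hyphen, apostrophe or asterisk.

```python
import re

REPEATED_VOWEL = re.compile(
    r"[Aa][-'*]?a|[Ee][-'*]?e|[Oo][-'*]?o"   # juncture optional
    r"|[Ii][-'*]i|[Uu][-'*]u"                # juncture required
)

def starts_with_repeated_vowel(text, i=0):
    """True if a RepeatedVowel can be read at index i."""
    return REPEATED_VOWEL.match(text, i) is not None

print(starts_with_repeated_vowel("aa"))    # True
print(starts_with_repeated_vowel("ii"))    # False: possibly a monosyllable
print(starts_with_repeated_vowel("i'i"))   # True: explicit juncture
```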

Repeated continuants indicating syllabic ("vocalic") pronunciation.

RepeatedVocalic <- (([Mm] [m]) / ([Nn] [n]) / ([Ll] [l]) / ([Rr] [r]))

Single continuant consonants.

Syllabic <- [lmnr]

Single non-continuant consonants.

Nonsyllabic <- (!(Syllabic) C1)

A pair consisting of a non-continuant followed by a continuant: this is forbidden to be a pair of final consonants at the end of a syllable. There is some logic here to exclude pairs of consonants which cannot be pairs of final consonants at all: the pair will not be followed by a vowel segment, and it will not overlap a doubled continuant or the permissible initial mr.

Badfinalpair <- (Nonsyllabic !('mr') !(RepeatedVocalic) Syllabic !((V2 / [y] / RepeatedVocalic)))

The initial consonants in a syllable. The first version appears in borrowed predicates, the second in names. This is a segment of one to three consonants, each adjacent pair being a permissible initial, not sharing a letter with a syllabic doubled continuant, and (in a predicate) not followed by y, and of course not followed by a juncture.

FirstConsonants <- (((!((C1 C1 RepeatedVocalic)) &(InitialCC) (C1 InitialCC)) / (!((C1 RepeatedVocalic)) InitialCC) / ((!(RepeatedVocalic) C1) !([y]))) !(juncture))

FirstConsonants2 <- (((!((C1 C1 RepeatedVocalic)) &(InitialCC) (C1 InitialCC)) / (!((C1 RepeatedVocalic)) InitialCC) / (!(RepeatedVocalic) C1)) !(juncture))

The vowel segment in a syllable, the first version appearing in borrowed predicates and the second in names. In a predicate, this is either an instance of NextVowels (the appropriate next vowel or pair of vowels chosen from a stream of regular vowels) which is not followed by a syllabic doubled continuant, or a syllabic doubled continuant not followed by the same continuant. In a name, an instance of NextVowels may be followed by a syllabic doubled continuant.

VowelSegment <- ((NextVowels !(RepeatedVocalic)) / (!((C1 RepeatedVocalic)) RepeatedVocalic))

VowelSegment2 <- (NextVowels / (!((C1 RepeatedVocalic)) RepeatedVocalic))

Borrowed Predicates and Name Words

The Loglan syllable, in the version found in borrowed predicates. The distinction between SyllableA and SyllableB is a subtlety. SyllableA starts with CV followed by a consonant (in the same or the next syllable), and zero, one or two final consonants (not making up a bad final pair) with the second one forbidden to stand at the beginning of a legal syllable; SyllableB takes the general form of an (optional) initial group of consonants followed by a vowel segment (which does not overlap a doubled vowel forcing a stress) followed by zero, one or two final consonants, not forming a bad final pair, and neither of them standing at the beginning of a legal syllable.

The general idea is that one starts a new syllable as soon as possible, in the absence of explicit junctures, with the exception that in the situation CVcc (cc being a permissible initial) one will by preference place the break thus: CVc-c (and similarly if the cc is replaced by an initial triple).

SyllableA <- ((C1 V2 &(C1) !(Badfinalpair) (FinalConsonant)? ((!(Syllable) FinalConsonant))?) (juncture)?)

SyllableB <- ((FirstConsonants)? !(RepeatedVowel) !((&(Mono) V2 RepeatedVowel)) VowelSegment !(Badfinalpair) ((!(Syllable) FinalConsonant))? ((!(Syllable) FinalConsonant))? (juncture)?)

Syllable <- (SyllableA / SyllableB)

Here is a permissible initial actually broken by an explicit juncture.

BrokenInitialCC <- (&(MaybeInitialCC) C1 juncture C1 &(V2))

The class JunctureFix describes some configurations of explicit junctures between consonants which are forbidden. They are mostly situations in which one is not allowed to break a permissible initial with an explicit syllable break, but one describes an impermissible way to avoid such a break. The purpose is to make it impossible to construct a legal borrowing by maliciously moving a syllable break in an illegal complex predicate (or for that matter by moving a syllable break to an inappropriate place in a legal complex predicate). Such a maneuver can be shown always to create one of these bad configurations. Some of these configurations can occur in legal complexes.

JunctureFix <- ((InitialCC V2 BrokenInitialCC) / (((C1 V2))? V2 BrokenInitialCC) / (C1 V2 !(stress) juncture InitialCC V2 Letter) / (C1 BrokenInitialCC V2))

Here is a menagerie of syllables which are or may be final in a borrowed predicate or irregular djifoa. All have regular vowel segments. The first is unstressed and followed by a consonant which starts a syllable, or by y, or by a character which is not a letter or juncture. The second is as the first but definitely followed by y or a character which is not a letter or juncture. The last two are definitely final in a borrowing djifoa, being followed by y, and being permitted to be stressed (the last one definitely is stressed).

SyllableFinal1 <- ((FirstConsonants)? !(RepeatedVocalic) VowelSegment !(stress) (juncture)? !(V2) (&(Syllable) / &([y]) / !(Letter)))

SyllableFinal2 <- ((FirstConsonants)? !(RepeatedVocalic) VowelSegment !(stress) (juncture)? (&([y]) / !(Letter)))

SyllableFinal2a <- ((FirstConsonants)? !(RepeatedVocalic) VowelSegment (juncture)? &([y]))

SyllableFinal2b <- ((FirstConsonants)? !(RepeatedVocalic) VowelSegment stress &([y]))

This is an explicitly stressed syllable (with its vowel segment not intersecting a doubled vowel forcing a stress).

Note that a final syllable cannot be followed by a regular vowel (mod intervening juncture). This is part of the arrangements for having to pause before vowel-initial words.

StressedSyllable <- (((FirstConsonants)? !(RepeatedVowel) !((&(Mono) V2 RepeatedVowel)) VowelSegment !(Badfinalpair) (FinalConsonant)? (FinalConsonant)?) stress)

These are final consonants in a Loglan syllable. This is a consonant not participating in a doubled continuant, not starting a forbidden medial pair or triple of consonants, and not followed by a vowel (with or without an intervening juncture). Vowel-initial syllables, when not initial, are always preceded immediately by vowel-final syllables.

FinalConsonant <- (!(RepeatedVocalic) !(NonmedialCC) !(NonjointCCC) C1 !(((juncture)? V2)))

Here is the form of the syllable which can appear in names. Note the absence of the strictures against the vowel segment overlapping a doubled vowel forcing a stress, and the possibility of y instead of a VowelSegment. Also note that neither final consonant may stand at the head of a syllable in a name: syllables end as soon as possible without exception.

Syllable2 <- (((FirstConsonants2)? (VowelSegment2 / [y]) !(Badfinalpair) ((!(Syllable2) FinalConsonant))? ((!(Syllable2) FinalConsonant))?) (juncture)?)

The usual consonant-final name words. A name must resolve into the form of syllable just given and it must end with a consonant. It must be followed by (and parsed as including) an explicit pause unless it is followed by end of text, terminal punctuation, another name word, or the little word ci. A final stress mark may appear between the final consonant and the comma pause.

Name <- (([ ])* &(((uppercase / lowercase) ((!((C1 (stress)? !(Letter))) Lowercase))* C1 (stress)? !(Letter) (&(end) / comma / &(period) / &(Name) / &(CI)))) ((Syllable2)+ (&(end) / comma / &(period) / &(Name) / &(CI))))

We commence the construction of borrowed predicates and irregular (borrowing) djifoa. The next class identifies syllables with a continuant vowel segment of the kind which can occur in borrowings or irregular djifoa.

CCSyllableB <- (((FirstConsonants)? RepeatedVocalic !(Badfinalpair) ((!(Syllable) FinalConsonant))? ((!(Syllable) FinalConsonant))?) (juncture)?)

This is the final pair or triple of syllables in a borrowing, starting with the stressed syllable, followed optionally by an unstressed continuant syllable, followed by one of the final syllable forms from above. If the stressed syllable is not explicitly stressed, the syllable must be followed by a character other than a letter or juncture (usually a space or punctuation), as in this case we must deduce the presence of the stressed syllable from the word break; if the stressed syllable is explicitly stressed, no such explicit word break is required.

BorrowingTail <- ((!(JunctureFix) !(CCSyllableB) StressedSyllable ((!(StressedSyllable) CCSyllableB))? !(StressedSyllable) SyllableFinal1) / (!(CCSyllableB) !(JunctureFix) Syllable ((!(StressedSyllable) CCSyllableB))? !(StressedSyllable) SyllableFinal2))

This is a first approximation to the borrowing class: a sequence of syllables which are not stressed and not initial in a borrowing tail, with no two continuant syllables adjacent, followed by a borrowing tail. Additional conditions must be imposed on this class to get a borrowing.

PreBorrowing <- (((!(BorrowingTail) !(StressedSyllable) !(JunctureFix) !((CCSyllableB CCSyllableB)) Syllable))* !(CCSyllableB) BorrowingTail)

This class encodes the condition that a borrowing predicate must contain a CC pair. A borrowing may begin with a consonant followed by an indeterminate length stream of unstressed vowels (or just an indeterminate length stream of unstressed vowels); if it does, what follows will begin with CC (possibly broken by a juncture) and cannot be a well-formed borrowing by itself (so the initial (C)Vn doesn't fall off) nor can it be a permissible initial broken by an explicit juncture which would start a borrowing if the juncture were dropped; if a borrowing does not begin with such a (C)Vn it begins CC.

HasCCPair <- ((((C1)? ((V2 ((!(stress) juncture))?))+ !(Borrowing) !((&(MaybeInitialCC) C1 (!(stress) juncture) !(CCVV) PreBorrowing)) (stress)?))? C1 (juncture)? C1)

A borrowing does not begin CVcc with the cc being a permissible initial (possibly broken with a juncture) and the second c starting a pre-complex or complex tail (something resolvable into djifoa); the initial CV would fall off. Such a form is resolvable into djifoa, but the initial CVc is required to be y-hyphenated; this is part of the elimination of the slinkui test effected in the 90's.

CVCBreak <- (C1 V2 (juncture)? &(MaybeInitialCC) C1 (juncture)? &((PreComplex / ComplexTail)))

CCVV and CCCVV predicates are forbidden. They would cause various technical problems.

CCVV <- ((&(BorrowingTail) C1 C1 (C1)? V2 stress !(Mono) V2) / (&(BorrowingTail) C1 C1 (C1)? V2 (juncture)? V2 (!(Letter) / ((juncture)? [y]))))

And now the borrowing class can be defined. A borrowing is a pre-borrowing which satisfies the CC pair condition as formalized above, is not a CVCBreak or CC(C)VV predicate, does not start with a continuant syllable, and does not begin ((C)V[V])VccV where cc is a permissible initial, even if the cc is broken by a juncture (there are predicates beginning with CVVccV). The difficulty with ((C)V[V])VccV is that the initial V or CV[V] would fall off an initial borrowing affix ((C)V[V])VccVy if this were the entire word (what would be left would be a ccVy djifoa); longer words of this shape would usually have the initial (C)Vn fall off in any case. A word of the shape CVVccV is a complex predicate; some longer words beginning with this are allowed.

Borrowing <- (&(HasCCPair) !(CVCBreak) !(CCVV) !(((((C1)? (V2 (juncture)?) ((V2 (juncture)? &(V2)))+))? V2 (juncture)? MaybeInitialCC V2)) !(CCSyllableB) (((!(BorrowingTail) !(StressedSyllable) !((CCSyllableB CCSyllableB)) !(JunctureFix) Syllable))* !(CCSyllableB) BorrowingTail))

The definition of an irregular (borrowing) djifoa follows. The basic idea is that a borrowing djifoa is a borrowing with no explicit stresses present, with y appended, with, optionally, a stress marker preceding the y (thus stressing the last, normally unstressed, syllable of the original borrowing). Note that a borrowing djifoa may be followed by a pause (in which case it is always stressed on the syllable preceding the y, whether this is explicitly indicated or not); the special class of stressed borrowing djifoa here is not followed by a pause.

PreBorrowingAffix <- ((((!(StressedSyllable) !(SyllableFinal2a) !((CCSyllableB CCSyllableB)) !(JunctureFix) Syllable))+ SyllableFinal2a) (juncture)? [y] !(stress) (juncture)? (([ ,] ([ ])*))?)

BorrowingAffix <- (&(HasCCPair) !(CVCBreak) !(CCVV) !(((((C1)? (V2 (juncture)?) ((V2 (juncture)? &(V2)))+))? V2 (juncture)? MaybeInitialCC V2)) !(CCSyllableB) (((!(StressedSyllable) !(SyllableFinal2a) !((CCSyllableB CCSyllableB)) !(JunctureFix) Syllable))+ SyllableFinal2a) (juncture)? [y] !(stress) (juncture)? (comma)?)

StressedBorrowingAffix <- (&(HasCCPair) !(CVCBreak) !(CCVV) !(((((C1)? (V2 (juncture)?) ((V2 (juncture)? &(V2)))+))? V2 (juncture)? MaybeInitialCC V2)) !(CCSyllableB) (((!(StressedSyllable) !(SyllableFinal2a) !((CCSyllableB CCSyllableB)) !(JunctureFix) Syllable))* SyllableFinal2b) (juncture)? [y] !(stress) (juncture)? !([,]))

Complex Predicates

Here is the y-hyphen to attach to djifoa to avoid phonetic problems. A y-hyphen is never stressed. It is never final in a word (so it is followed by a letter with optional intervening juncture). It is not followed by another y.

yhyphen <- ((juncture)? [y] !(stress) (juncture)? !([y]) &(letter))

This is an unstressed CV syllable suitable to be the final syllable of a five-letter djifoa: it is not stressed and it is not followed by a vowel (with or without intervening juncture), which is part of the arrangements for being forced to pause before vowel-initial words.

CV <- (C1 V2 !(stress) (juncture)? !(V2))

This is the final consonant in a CVC djifoa. It may take the form Cy (if the djifoa is "hyphenated"). Otherwise it is not initial in a forbidden pair or triple of consonants and it is not followed by a vowel (with or without intervening juncture).

4/27 revised the grammar to add optional juncture at the beginning of Cfinal so that for example mekykiu can be me'kykiu as well as mek'ykiu. This is important relative to avoiding the allophone of h. We do want to prevent a juncture appearing both before and after the consonant!

Cfinal <- ((((juncture &((C1 !(juncture)))))? C1 yhyphen) / (!(NonmedialCC) !(NonjointCCC) C1 !(((juncture)? V2))))

This is the general form of a "hyphen" averting phonetic problems with a djifoa. It is either r (which may not be followed by r, with or without intervening juncture), n (which will be followed by r, with or without intervening juncture), or y. A "hyphen" will be followed by a consonant, with or without an intervening juncture.

A typical use of such a "hyphen" is to attach a CVV djifoa to the beginning of a predicate so that it will not "fall off".

hyphen <- (!(NonmedialCC) !(NonjointCCC) (([r] !(((juncture)? [r])) !(((juncture)? V2))) / ([n] (juncture)? &([r])) / ((juncture)? [y] !(stress))) ((juncture)? &(letter)) !(((juncture)? [y])))

This is a "hyphen" of the shape n or r.

noyhyphen <- (!(NonmedialCC) !(NonjointCCC) (([r] !(((juncture)? [r])) !(((juncture)? V2))) / ([n] (juncture)? &([r]))) &(((juncture)? &(letter))) !(((juncture)? [y])))

This is a class of stressed syllables suitable for complexes.

StressedSyllable2 <- (((FirstConsonants)? VowelSegment !(Badfinalpair) (FinalConsonant)? (FinalConsonant)?) stress (yhyphen)?)

Varieties of CVV djifoa follow. CVVStressed is a CVV djifoa which is definitely stressed and can be stressed on the second syllable (the first case is that of a doubled vowel, where there is an option about how the stress could be placed). CVVStressed2 is a stressed monosyllabic CVV djifoa. CVV is the general class of these djifoa.

CVVStressed <- (((C1 &(RepeatedVowel) V2 !(stress) (juncture)? !(RepeatedVowel) V2 (noyhyphen)?) (juncture)? (yhyphen)?) / (C1 !(BrokenMono) V2 !(stress) juncture V2 (noyhyphen)? stress (yhyphen)?) / (C1 !(Mono) V2 V2 (noyhyphen)? stress (yhyphen)?))

CVVStressed2 <- (C1 Mono (noyhyphen)? stress (yhyphen)?)

CVV <- (!((C1 V2 stress V2 (hyphen)? stress)) ((C1 !(BrokenMono) V2 (juncture)? !(RepeatedVowel) V2 (noyhyphen)?) (juncture)? !(V2) (yhyphen)?))

These are classes of CVV djifoa which are or can be final in a complex, except for the last one which has technical uses (it is followed by a y, which will not happen at the end of a complex).

Note that none of these can be followed by a regular vowel (mod intervening juncture); this is part of the arrangements for having to pause before vowel-initial words.

CVVFinal1 <- (C1 !(BrokenMono) V2 stress !(RepeatedVowel) V2 !(stress) (juncture)? !(V2))

CVVFinal2 <- (((C1 !(Mono) V2 V2) / (C1 !(BrokenMono) V2 juncture !(RepeatedVowel) V2)) !(Letter))

CVVFinal3 <- (C1 &(Mono) V2 V2 !(stress) (juncture)? !(V2))

CVVFinal4 <- (C1 Mono !(Letter))

CVVFinal5 <- (((C1 !(Mono) V2 V2) / (C1 !(BrokenMono) V2 juncture V2)) &(((juncture)? [y])))

Varieties of CVC djifoa. The first is general; the second is stressed.

NOTE: the change made above in Cfinal motivated an additional form for CVCStressed.

CVC djifoa cannot be final in a complex, so no final forms are given.

CVC <- ((C1 V2 Cfinal) (juncture)?)

CVCStressed <- ((C1 V2 !(NonmedialCC) !(NonjointCCC) C1 stress !(V2) (yhyphen)?) / (C1 V2 stress C1 !(juncture) yhyphen))

Varieties of CCV djifoa. A basic form and an explicitly stressed form are given.

CCV <- (InitialCC !(RepeatedVowel) V2 (juncture)? !(V2) (yhyphen)?)

CCVStressed <- (InitialCC !(RepeatedVowel) V2 stress !(V2) (yhyphen)?)

Forms of CCV djifoa which are or can be final. As above, note that none of these can be followed by a regular vowel.

CCVFinal1 <- (InitialCC !(RepeatedVowel) V2 !(stress) (juncture)? !(V2))

CCVFinal2 <- (InitialCC V2 !(Letter))

Forms of CCVCV djifoa.

CCVCVMedial <- (InitialCC V2 (juncture)? C1 [y] !(stress) (juncture)? &(letter))

CCVCVMedialStressed <- (CCV stress C1 [y] !(stress) (juncture)? &(letter))

Forms of CCVCV predicates final in a complex (and a form followed by y).

CCVCVFinal1 <- (InitialCC V2 stress CV)

CCVCVFinal2 <- (InitialCC V2 (juncture)? CV !(Letter))

CCVCVY <- (InitialCC V2 (juncture)? CV [y])

Forms of CVCCV djifoa. The stressed form has two alternatives because the syllable break can be in different places.

CVCCVMedial <- (C1 V2 ((juncture &(InitialCC)))? !(NonmedialCC) C1 (juncture)? C1 [y] !(stress) (juncture)? &(letter))

CVCCVMedialStressed <- ((C1 V2 (stress &(InitialCC)) !(NonmedialCC) C1 C1 [y] !(stress) (juncture)? &(letter)) / (C1 V2 !(NonmedialCC) C1 stress C1 [y] !(stress) (juncture)? &(letter)))

Forms of CVCCV predicates final in a complex, and forms with an appended y. Note that none of the five letter predicate forms can be followed by a vowel, mod intervening juncture.

CVCCVFinal1a <- (C1 V2 stress InitialCC V2 !(stress) (juncture)? !(V2))

CVCCVYa <- (C1 V2 (juncture)? InitialCC V2 !(stress) (juncture)? [y])

CVCCVFinal1b <- (C1 V2 !(NonmedialCC) C1 stress CV)

CVCCVYb <- (C1 V2 !(NonmedialCC) C1 (juncture)? CV [y])

CVCCVFinal2 <- (C1 V2 ((juncture &(InitialCC)))? !(NonmedialCC) C1 (juncture)? CV !(Letter))

The five letter predicate forms with appended y. This form and all of its subforms exist only to be excluded: this is not a possible form for a borrowing djifoa.

FiveLetterY <- (CCVCVY / CVCCVYa / CVCCVYb)

Classes of final djifoa of interest.

GenericFinal <- (CVVFinal3 / CVVFinal4 / CCVFinal1 / CCVFinal2)

FiveLetterFinal <- (CCVCVFinal1 / CCVCVFinal2 / CVCCVFinal1a / CVCCVFinal1b / CVCCVFinal2)

GenericTerminalFinal <- (CVVFinal4 / CCVFinal2)

Regular djifoa.

Affix1 <- (CCVCVMedial / CVCCVMedial / CCV / CVV / CVC)

The classes Peelable and Peelable2 are cunning pieces of PEG programming. They represent fake regular djifoa which are actually initial segments of borrowing djifoa and borrowings proper, respectively.

Peelable <- (&(PreBorrowingAffix) !(CVVFinal1) !(CVVFinal5) Affix1 (!(Affix1) / &((&(PreBorrowingAffix) !(CVVFinal1) !(CVVFinal5) Affix1 !(PreBorrowingAffix) !(Affix1))) / Peelable))

Peelable2 <- (&(PreBorrowing) !(CVVFinal1) !(CVVFinal2) !(CVVFinal5) !(FiveLetterFinal) Affix1 !(FiveLetterFinal) (!(Affix1) / &((&(PreBorrowing) !(FiveLetterFinal) !(CVVFinal1) !(CVVFinal2) !(CVVFinal5) Affix1 !(PreBorrowing) !(FiveLetterFinal) !(Affix1))) / Peelable2))

The general class of djifoa: regular djifoa which are not fake in the sense just indicated, or borrowing djifoa other than the illegal five letter forms.

Affix <- ((!(Peelable) !(Peelable2) Affix1) / (!(FiveLetterY) BorrowingAffix))

Djifoa which are either regular djifoa containing no stresses or borrowing djifoa.

Affix2 <- (!(StressedSyllable2) !(CVVStressed) Affix)

This is the final djifoa or two of a complex, containing the stress and long enough to be a complex itself. The logic could bear extensive commentary.

ComplexTail <- ((Affix GenericTerminalFinal) / (!((!(Peelable) Affix1)) !(FiveLetterY) StressedBorrowingAffix GenericFinal) / (CCVCVMedialStressed GenericFinal) / (CVCCVMedialStressed GenericFinal) / (CCVStressed GenericFinal) / (CVCStressed GenericFinal) / (CVVStressed GenericFinal) / (CVVStressed2 GenericFinal) / (Affix2 CVVFinal1) / (Affix2 CVVFinal2) / CCVCVFinal1 / CCVCVFinal2 / CVCCVFinal1a / CVCCVFinal1b / CVCCVFinal2 / (!((CVVStressed / StressedSyllable2)) Affix !((!(Peelable2) Affix1)) Borrowing !(((juncture)? [y]))))

The primitive five-letter predicates.

Primitive <- (CCVCVFinal1 / CCVCVFinal2 / CVCCVFinal1a / CVCCVFinal1b / CVCCVFinal2)

An initial approximation to the class of complex predicates: a possibly empty chain of affixes, none of them stressed or the start of a complex tail, followed by a complex tail.

PreComplex <- (ComplexTail / ((!((CVCStressed / CCVStressed / CVVStressed / ComplexTail / StressedSyllable2)) Affix) PreComplex))

The class of complex predicates. This is a precomplex which does not begin with an unhyphenated CVV followed by another CVV, nor with any other CVV which would fall off the front, nor with CVcc (with or without a juncture between the c's) where cc is a permissible initial (mod possible juncture) and the second c starts a precomplex or complex tail: a CVC djifoa followed by a djifoa which would make a permissible initial pair must be y-hyphenated (this is how the slinkui test was eliminated).

Complex <- (!((C1 V2 (juncture)? (V2)? (juncture)? CVV)) !((C1 V2 !(stress) (juncture)? (V2)? !(stress) (juncture)? (Primitive / PreComplex / Borrowing / CVV))) !((C1 V2 (juncture)? &(MaybeInitialCC) C1 (juncture)? &((PreComplex / ComplexTail)))) PreComplex)

A predicate is a primitive, complex or borrowing (notice that primitive is tried first, then complex, then borrowing); in addition, such a word (or a consonant-initial unit cmapua) can be linked with zao to a predicate word to form a new predicate. The zao construction means exactly the same thing as concatenation of djifoa (this construction is a proposal of John Cowan; it is an alternative to use of borrowing djifoa, for example).

Predicate <- (((&(caprule) ((Primitive / Complex / Borrowing) ((([ ])* Z AO (', ')? ([ ])* Predicate))?)) / (C1 V2 (V2)? ([ ])* Z AO (comma)? ([ ])* Predicate)) !(((juncture)? [y])))

Phonology of syllables in cmapua

A consonant followed by four vowels cannot occur in a cmapua. This prevents a CVV cmapua from being followed by a VV cmapua without pause, part of the mechanism to force pauses before vowel initial words.

Fourvowels <- (C1 V2 (juncture)? V2 (juncture)? V2 (juncture)? V2)
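
As a rough illustration outside the grammar itself, the Fourvowels check can be approximated with a regular expression. This is only a sketch: it assumes that C1 is any Loglan consonant letter, that V2 is one of a, e, i, o, u, and that a juncture is a single hyphen, apostrophe or asterisk. The real classes are defined elsewhere in this file and may differ. The names CONSONANT, VOWEL, JUNCTURE and starts_with_fourvowels are hypothetical.

```python
import re

# Assumed approximations of the classes C1, V2 and juncture (not the real definitions).
CONSONANT = "[bcdfghjklmnprstvz]"
VOWEL = "[aeiou]"
JUNCTURE = "[-'*]?"

# A consonant, then four vowels, each vowel after the first optionally preceded by a juncture.
FOURVOWELS = re.compile(
    CONSONANT + VOWEL + (JUNCTURE + VOWEL) * 3,
    re.IGNORECASE,
)

def starts_with_fourvowels(text: str) -> bool:
    """Return True if text begins with a consonant followed by four vowels."""
    return FOURVOWELS.match(text) is not None

print(starts_with_fourvowels("taiua"))   # True
print(starts_with_fourvowels("tai ua"))  # False
```

The second call fails because a space is not a juncture in this sketch, which mirrors how the rule only blocks the unpaused sequence.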

Initial consonants in unit cmapua: they do not start predicates, nor are they followed by four vowels.

B <- (!(Predicate) !(Fourvowels) [Bb])

C <- (!(Predicate) !(Fourvowels) [Cc])

D <- (!(Predicate) !(Fourvowels) [Dd])

F <- (!(Predicate) !(Fourvowels) [Ff])

G <- (!(Predicate) !(Fourvowels) [Gg])

H <- (!(Predicate) !(Fourvowels) [Hh])

J <- (!(Predicate) !(Fourvowels) [Jj])

K <- (!(Predicate) !(Fourvowels) [Kk])

L <- (!(Predicate) !(Fourvowels) [Ll])

M <- (!(Predicate) !(Fourvowels) [Mm])

N <- (!(Predicate) !(Fourvowels) [Nn])

P <- (!(Predicate) !(Fourvowels) [Pp])

R <- (!(Predicate) !(Fourvowels) [Rr])

S <- (!(Predicate) !(Fourvowels) [Ss])

T <- (!(Predicate) !(Fourvowels) [Tt])

V <- (!(Predicate) !(Fourvowels) [Vv])

Z <- (!(Predicate) !(Fourvowels) [Zz])

Lone vowels in cmapua syllables; note that the juncture following is the juncture2 class, guarded against a following predicate without a pause if it is a stress, and of course not followed by a vowel (with or without intervening juncture), which helps to prevent confusion with CVV cmapua and force pauses before vowel-initial words.

a <- ([Aa] (juncture2)? !(V2))

e <- (([Ee] (juncture2)?) !(V2))

i <- ([Ii] (juncture2)? !(V2))

o <- ([Oo] (juncture2)? !(V2))

u <- ([Uu] (juncture2)? !(V2))

Pairs of vowels in cmapua; note the use of juncture2. Such a pair of vowels is followed either by a single vowel followed by a non-vowel (in a Cvv-V situation) or an even number of vowels (nonzero only in a compound attitudinal).

The V2 non-V2 option is supported for all vowel pairs; it could be restricted to monosyllables. This has nontrivial effects: it means that one does not have to pause before a legacy vowel name after a VV or CVV cmapua syllable. NOTE (4/27): the new rule V3 ensures that one must pause between a VV and a following vowel-initial predicate.

There are currently two (sort of) exceptions to being forced to pause before vowel-initial words. First, VV attitudinals can be freely chained without pause, though one must pause before the first one. Second, one must pause before a legacy vowel name of VCV form, except when it occurs as a unit in an acronym. Neither of these is strictly an exception to the rule that one must pause before a vowel-initial word. At this point we have completed the proof, interspersed through documentation of word forms in this section, that one must pause before a vowel-initial word, with the exceptions or near-exceptions noted.

V3 <- (!(Predicate) V2)

AA <- ([Aa] (juncture)? [a] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

AE <- ([Aa] (juncture)? [e] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

AI <- ([Aa] [i] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

AO <- ([Aa] [o] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

AU <- ([Aa] (juncture)? [u] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

EA <- ([Ee] (juncture)? [a] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

EE <- ([Ee] (juncture)? [e] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

EI <- ([Ee] [i] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

EO <- ([Ee] (juncture)? [o] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

EU <- ([Ee] (juncture)? [u] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

IA <- ([Ii] (juncture)? [a] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

IE <- ([Ii] (juncture)? [e] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

II <- ([Ii] (juncture)? [i] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

IO <- ([Ii] (juncture)? [o] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

IU <- ([Ii] (juncture)? [u] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

OA <- ([Oo] (juncture)? [a] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

OE <- ([Oo] (juncture)? [e] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

OI <- ([Oo] [i] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

OO <- ([Oo] (juncture)? [o] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

OU <- ([Oo] (juncture)? [u] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

UA <- ([Uu] (juncture)? [a] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

UE <- ([Uu] (juncture)? [e] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

UI <- ([Uu] (juncture)? [i] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

UO <- ([Uu] (juncture)? [o] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))

UU <- ([Uu] (juncture)? [u] (juncture2)? (&((V3 (juncture)? !(V2))) / !(Oddvowel)))
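
The closing condition shared by all the vowel-pair rules can be pictured with a small check on the text that follows the pair. This is a simplified sketch, not part of the grammar: it ignores junctures and the V3 guard against predicates, and it assumes that Oddvowel (defined elsewhere) matches an odd-length run of vowels. The name vowel_pair_tail_ok is hypothetical.

```python
VOWELS = set("aeiouAEIOU")

def vowel_pair_tail_ok(rest: str) -> bool:
    """Check the text following a vowel pair.

    Accepts a run of exactly one vowel, or an even number of vowels
    (including zero), at the start of rest.
    """
    run = 0
    for ch in rest:
        if ch not in VOWELS:
            break
        run += 1
    return run == 1 or run % 2 == 0

print(vowel_pair_tail_ok(""))      # True
print(vowel_pair_tail_ok("a ta"))  # True
print(vowel_pair_tail_ok("ui"))    # True
print(vowel_pair_tail_ok("uia"))   # False
```

A run of one vowel corresponds to the first alternative in the rules (the Cvv-V situation) and an even run to the second.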


Conditions on cmapua words

Conditions which must hold at the start of a cmapua word.

__LWinit <- (([ ])* !(Predicate) &(caprule))


Convert an explicit pause which might be significant to a form which will be ignored.

CANCELPAUSE <- (comma (('y' comma) / (C UU !(connective))))

A pause, of the sort which is recognized as a free modifier (not as an elided gu). The phonetically required pauses after name words or after stressed cmapua before consonant initial predicates are not of this form (they are part of the words they follow).

PAUSE <- (!(CANCELPAUSE) comma !(connective) !(V1))

Letter names

A simple letter name. These appear early because they have to be mentioned in a variety of rules.

These include both the traditional Vfi and Vma vowel names and the new ziV and ziVma vowel names.

I have provided full support for the legacy vowel names, but they are phonetically quite exceptional in various ways. Added the Greek legacy vowels in -zi.

TAI0 <- (!(Predicate) (((V1 (juncture)? !(Predicate) !(Name) M a (juncture2)?) / (V1 (juncture)? !(Predicate) !(Name) F i (juncture2)?) / (V1 (juncture)? !(Predicate) !(Name) Z i (juncture2)?) / (C1 AI (u)?) / (C1 EI (u)?) / (C1 EO) / (Z [i] (juncture)? V1 (juncture2)? ((M a))? (juncture2)?)) (!(Oddvowel) / (!([ ]) &(TAI0)))))

Logical connectives and utterance connectives

A general purpose negative suffix.

NOI <- (N OI)

An atomic logical connective, one of a, e, o, u, ha. This cannot be followed by a vowel and, if stressed, falls under the rule requiring a pause between a stressed final cmapua syllable and a following predicate.

A0 <- (!(Predicate) !((Mono / BrokenMono)) (([AEOUaeou] / (H a)) (juncture2)? !(V2)))

A more general class of logical connectives. This is guarded against confusion with the initial vowel in a legacy Vfi/Vma vowel name. The connective u or no-initial forms may form converses with initial nu. Forms not starting with no or nu may be prefixed with no to indicate negation of what precedes them. The A0 core follows. Optionally, this may be followed by noi to negate what follows. Optionally, this may be followed by a PA word (a tense, location or modifier word) with no internal pauses, closed with fi or an explicit pause. If a logical connective is followed by a PA word not intended as a suffix an intervening explicit comma pause is usually required. The pause closure is provided for backward compatibility with legacy text: fi is preferred going forward.

The APA forms (and related IPA forms) are a phonetic problem in 1989 Loglan. I have tried out a number of solutions: the best one seems to be to force closure of the PA component. It can be noted that JCB appears to have usually closed IPA words with an explicit comma pause in Notebook 3. Getting legacy text to parse often requires addition of commas to disambiguate APA from A PA situations.

A <- (__LWinit !(TAI0) (((N [u]) &((u / (N [o])))))? ((N [o]))? A0 (NOI)? !((([ ])+ PANOPAUSES PAUSE)) !((PANOPAUSES !(PAUSE) [ ,])) ((PANOPAUSES ((F i) / &(PAUSE))))?)

Suffixed classes of logical connectives with different precedence, closed with ci or ge (and so of course needing no closure in their internal copy of an element of the class above) use this class as a component.

ANOFI <- (__LWinit !(TAI0) (((N [u]) &((u / (N [o])))))? ((N [o]))? A0 (NOI)? (PANOPAUSES)?)

The A class packaged as a word.

A1 <- (A !(connective))

Suffixed classes of logical connectives with different precedence, closed with ci or ge.

ACI <- (ANOFI C i !(connective))

AGE <- (ANOFI G e !(connective))

Tightly binding logical connectives used between predicates. The penultimate form in this list is an internal component of a class of utterance logical connectives given below.

CA0 <- ((((N o))? ((C a) / (C e) / (C o) / (C u) / (Z e) / (C i H a))) (NOI)?)

CA1 <- ((((N u) &(((C u) / (N o)))))? ((N o))? CA0 !((([ ])+ PANOPAUSES PAUSE)) !((PANOPAUSES !(PAUSE) [ ,])) ((PANOPAUSES ((F i) / &(PAUSE))))?)

CA1NOFI <- ((((N u) &(((C u) / (N o)))))? ((N o))? CA0 (PANOPAUSES)?)

CA <- (__LWinit &(caprule) CA1 !(connective))

The mixture connective ze used between predicates is included in the class just given; this is the mixture connective used between arguments.

ZE2 <- (__LWinit (Z e) !(connective))

Assorted utterance connectives.

I <- (__LWinit !(TAI0) i !((([ ])+ PANOPAUSES PAUSE)) !((PANOPAUSES !(PAUSE) [ ,])) ((PANOPAUSES ((F i) / &(PAUSE))))? !(connective))

ICA <- (__LWinit !(Predicate) i ((H a) / CA1) !(connective))

ICI <- (__LWinit i (CA1NOFI)? C i !(connective))

IGE <- (__LWinit i (CA1NOFI)? G e !(connective))

The phonetic class of connective words, plus the vowel initial letterals (the legacy VCV vowel names). These are used to enforce the rule that one must pause before certain words -- not all of the connectives are vowel-initial, so it is not just a consequence of that phonetic rule.

The standard use of these is that most word classes end with !connective to force an explicit comma pause before these connectives. This is not simply an instance of being forced to pause before vowel-initial words, as it also applies to consonant-initial elements of these classes.

connective <- (ACI / AGE / A1 / ICI / ICA / IGE / I / (&(V1) TAI0))

Basic forethought logical connectives.

KA0 <- ((K a) / (K e) / (K o) / (K u) / (K i H a))

The causal and comparative modifiers. They appear here because they can be used to build causal and comparative forethought connectives.

The addition of comparative connectives is a novelty, but an inevitable side effect of allowing converse and negative forms of the comparative modifiers described in one of the paradigms.

KOU <- ((K OU) / (M OI) / (R AU) / (S OA) / (M OU) / (C IU))

KOU1 <- (((N u N o) / (N u) / (N o)) KOU)

The general initial forms in forethought connections.

KA <- (__LWinit &(caprule) (((((N u) &((K u))))? KA0) / ((KOU1 / KOU) K i)) (NOI)? !(connective))

The two medial forms allowed in forethought connections.

KI <- (__LWinit (K i) (NOI)? !(connective))

This identifies a causal or comparative modifier which is not the initial segment of a causal or comparative initial forethought connective.

KOU2 <- (KOU1 !(KI))

Numerals and quantifiers

A rule identifying a misplaced stress in a numerical predicate.

BadNIStress <- ((C1 V2 (V2)? stress ((M a))? ((M OA))? NI RA) / (C1 V2 stress V2 ((M a))? ((M OA))? NI RA))

Atomic quantity words, other than the quantifiers which double as suffixes.

NI0 <- (!(BadNIStress) ((K UA) / (G IE) / (G IU) / (H IE) / (H IU) / (K UE) / (N EA) / (N IO) / (P EA) / (P IO) / (S UU) / (S UA) / (T IA) / (Z OA) / (Z OO) / (H o) / (N i) / (N e) / (T o) / (T e) / (F o) / (F e) / (V o) / (V e) / (P i) / (R e) / (R u) / (S e) / (S o) / (H i)))

The quantifier prefixes.

ie is the interrogative "which". Note that it is allowed to be followed by a pause (phonetics forbid it to be part of a "word" with the other components of a NI word).

SA <- (!(BadNIStress) ((S a) / (S i) / (S u) / (IE (((comma2)? !(IE) SA))?)) (NOI)?)

The quantifier words which double as predicate forming suffixes.

RA <- (!(BadNIStress) ((R a) / (R i) / (R o)))

ma, standing for 00 and moa (replacing 1989 Loglan mo) are removed from the pool of atomic quantifier words and are used only as affixes. Class NI1 consists of an atomic quantity word followed optionally by ma followed optionally by moa (where moa is optionally suffixed with a sequence of digits to tell you how many blocks 000 are to be appended).

Further, such a unit may be followed by a space or comma which is to be ignored if it is further followed by a NI word which is not a numerical predicate. One may pause explicitly or implicitly when uttering a long numeral.

NI1 <- ((NI0 ((!(BadNIStress) M a))? ((!(BadNIStress) M OA (NI0)*))?) ((comma2 !((NI RA)) &(NI)))?)

Words of class RA may similarly receive affixes.

RA1 <- ((RA ((!(BadNIStress) M a))? ((!(BadNIStress) M OA (NI0)*))?) ((comma2 !((NI RA)) &(NI)))?)
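
The effect of the ma and moa affixes described above can be sketched as a digit-string expansion. This is an illustration under assumptions, not the parser's own arithmetic: the digit values are the standard Loglan ones (which this file does not itself state), only digit words are handled, and the reading of moa with no count as a single 000 block is taken from the commentary. The name ni1_digit_string and its parameters are hypothetical.

```python
# Loglan digit words (assumed standard values; other NI0 words are not handled).
DIGITS = {"ni": 0, "ne": 1, "to": 2, "te": 3, "fo": 4,
          "fe": 5, "so": 6, "se": 7, "vo": 8, "ve": 9}

def ni1_digit_string(digit, ma=False, moa=False, moa_count=()):
    """Expand one NI1 unit into decimal digits.

    digit     -- a digit word such as "ne"
    ma        -- True if the unit carries the suffix ma (appends 00)
    moa       -- True if the unit carries the suffix moa (appends blocks of 000)
    moa_count -- digit words after moa giving the number of 000 blocks;
                 empty means a single block
    """
    out = str(DIGITS[digit])
    if ma:
        out += "00"
    if moa:
        blocks = int("".join(str(DIGITS[d]) for d in moa_count)) if moa_count else 1
        out += "000" * blocks
    return out

print(ni1_digit_string("ne", ma=True))                      # 100
print(ni1_digit_string("to", moa=True))                     # 2000
print(ni1_digit_string("te", moa=True, moa_count=("to",)))  # 3000000
```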

This is a class of complex quantifier words. The basic unit is what I call a block: a block optionally starts with a SA, followed by a string of one or more NI1's or a single RA1, or it is a SA alone; in either case it is optionally suffixed with noi. A NI2 is either a block or a string of blocks linked with CA0 logical connective units.

NI2 <- ((((SA)? ((NI1)+ / RA1)) / SA) (NOI)? ((CA0 (((SA)? ((NI1)+ / RA1)) / SA) (NOI)?))*)

A NI quantifier starts with a core NI2, optionally followed either by cu or by mue followed by an acronym (standing for a dimension) ending with an explicit comma pause, terminal punctuation, or end of text. The use of mue and the requirement of an explicit pause after the acronym address the usual phonetic problems with isolating acronyms.

NOTE: 4/28 examination of this rule motivated moving ie into the class SA as shown above and otherwise eliminating all special rules for handling this word.

NI <- (__LWinit NI2 ((&((M UE)) Acronym (comma / &(end) / &(period)) !((C u))))? ((C u))?)

This is the previous class encapsulated as a word.

mex <- (__LWinit NI !(connective))

The little word ci

This word has a number of independent uses as a very tightly binding verbal hyphen. It is also a name marker due to its role in serial names.

CI <- (__LWinit (C i) !(connective))


This is the basic acronym construction, used to build acronyms as dimensions (seen above) and as names [not predicates as in 1989 Loglan]. mue is allowed as an initial component of an acronym. An acronym starts with an atomic letter name or a form zV (never followed by a vowel) and continues with further units which may be atomic letter names, zV units (never followed by vowels), or NI1 numeral units. There are always at least two units (mue makes one-letter or numeral-initial acronyms possible). The V units allowed in 1989 Loglan have been eliminated.

NOTE: I think the segment ([Zz] V2 (!(V2) / ([Zz] &(V2))))))+) needs repair.

Acronym <- (([ ])* &(caprule) ((M UE) / TAI0 / ([Zz] V2 !(V2))) (((comma &(Acronym) M UE) / NI1 / TAI0 / ([Zz] V2 (!(V2) / ([Zz] &(V2))))))+)


The most general form of letter names. There is an extra construction here allowing formation of foreign letter names by prefixing gao to words of quite general forms (including name words, so gao is a name marker).

TAI <- (__LWinit (TAI0 / ((G AO) !(badspaces) !(V2) ([ ])* (Name / Predicate / (C1 V2 V2 (!(Oddvowel) / &(TAI0))) / (C1 V2 (!(Oddvowel) / &(TAI0)))))) !(connective))

Atomic pronouns other than letters.

DA0 <- ((T AO) / (T IO) / (T UA) / (M IO) / (M IU) / (M UO) / (M UU) / (T OA) / (T OI) / (T OO) / (T OU) / (T UO) / (T UU) / (S UO) / (H u) / (B a) / (B e) / (B o) / (B u) / (D a) / (D e) / (D i) / (D o) / (D u) / (M i) / (T u) / (M u) / (T i) / (T a) / (M o))

Pronouns in general: these are atomic letters or atomic pronouns, optionally suffixed with ci followed by a digit. NOTE: should general letters of class TAI be allowed?

DA1 <- ((TAI0 / DA0) ((C i !([ ]) NI0))?)

The previous class encapsulated as a word.

DA <- (__LWinit DA1 !(connective))

Tenses, locations and modifiers

Atomic tense, location, and modifier words (prepositions, as it were). The causal and comparative words are included when not followed by noi or ki.

PA0 <- (((N u !(KOU)))? ((G IA) / (G UA) / (P AU) / (P IA) / (P UA) / (N IA) / (N UA) / (B IU) / (F EA) / (F IA) / (F UA) / (V IA) / (V II) / (V IU) / (C OI) / (D AU) / (D II) / (D UO) / (F OI) / (F UI) / (G AU) / (H EA) / (K AU) / (K II) / (K UI) / (L IA) / (L UI) / (M IA) / (N UI) / (P EU) / (R OI) / (R UI) / (S EA) / (S IO) / (T IE) / (V a) / (V i) / (V u) / (P a) / (N a) / (F a) / (V a) / (KOU !((N OI)) !(KI))) ((N OI))?)

A prepositional word without internal pauses except possibly next to internal logical connective units. This consists of blocks of atomic prepositional words, possibly linked with CA0 logical connective units, optionally closed with a class ZI affix.

PANOPAUSES <- (((!(PA0) NI))? ((KOU2 / PA0))+ ((((comma2)? CA0 (comma2)?) ((KOU2 / PA0))+))* (ZI)?)

The previous class encapsulated as a word. Prepositional words used as prepositions take this form.

PA3 <- (__LWinit PANOPAUSES !(connective))

Compound prepositional words admitting pauses between prepositional units. These can be used only as tenses.

PA <- (((!(PA0) NI))? ((KOU2 / PA0))+ (((((comma2)? CA0 (comma2)?) / (comma2 !(mod1a))) ((KOU2 / PA0))+))* (ZI)?)

The previous class encapsulated as a word.

PA2 <- (__LWinit PA !(connective))

The little word ga used as a null tense.

GA <- (__LWinit (G a) !(connective))

The class of tenses.

PA1 <- ((PA2 / GA) !(connective))

Affixes which can be used to terminate compound prepositional words.

ZI <- ((Z i) / (Z a) / (Z u))


These are the general-purpose articles. la is also the standard name marker, but may be used as an article with predicates as well.

LE <- (__LWinit ((L EA) / (L EU) / (L OE) / (L EE) / (L AA) / (L e) / (L o) / ((L a) !(badspaces))) !(connective))

The class of article for construction of abstract descriptions. These include quantifiers.

NOTE: should possessive and relative modification of these be allowed as for LE?

LEFORPO <- (__LWinit ((L e) / (L o) / NI2) !(connective))

The article for numeric descriptions.

LIO <- (__LWinit (L IO) !(connective))

Constructors for explicit ordered and unordered sets. The first two forms are initial (opening braces), the second two are final (closing braces) and the third two are medial (commas). The commas are new.

LAU <- (__LWinit (L AU) !(connective))

LOU <- (__LWinit (L OU) !(connective))

LUA <- (__LWinit (L UA) !(connective))

LUO <- (__LWinit (L UO) !(connective))

ZEIA <- (__LWinit Z EI a !(connective))

ZEIO <- (__LWinit Z EI o !(connective))

Alien text forms

The opening and closing markers for quoted Loglan text.

LI1 <- (L i)

LU1 <- (L u)

The optional markers for textual or verbal quotation. Some deprecate these.

Quotemod <- ((Z a) / (Z i))

The construction for quoted Loglan text. Complete Loglan utterances or Loglan name words can be quoted. A style in which commas appear bracketing the closing text, found in our sources, is supported but not required.

LI <- ((__LWinit LI1 !(V2) (Quotemod)? ((([,])? ([ ])+))? utterance0 (', ')? __LWinit LU1 !(connective)) / (__LWinit LI1 !(V2) (Quotemod)? comma name (comma)? __LWinit LU1 !(connective)))

The general construction for alien text to be included in Loglan text. A block of such text may include any characters in its body except for spaces; commas or terminal punctuation in final position will not be read as part of the block. It may begin or end with a comma pause (included in it by the parser). It will be followed either by spaces followed by a letter or by end of text or terminal punctuation. Alien text may include more than one block, separated by the little word y, only the last one possibly closed with a comma pause. Phonetically, the pronunciation of alien text is not prescribed by Loglan, but each block must be preceded and followed by an actual pause, whether it is written or not, and there should be no pauses internal to blocks.

This is based on a solution for handling Linnaean names with more than one block of text described in L1, except that we require that the y be written.

stringnospaces <- (([,])? (([ ])+ ((!((([,])? [ ])) !(period) .))+) ((([,])? ([ ])+ &(letter)) / &(period) / &(end)))

stringnospacesclosed <- (([,])? (([ ])+ ((!((([,])? [ ])) !(period) .))+) (([,] ([ ])+) / &(period) / &(end)))

stringnospacesclosedblock <- ((stringnospaces ((!(([y] stringnospacesclosed)) [y] stringnospaces))* ([y] stringnospacesclosed)) / stringnospacesclosed)

This is the construction for foreign names (originally Linnaean names from biology, but Steve Rice pointed out the more general usefulness of the construction): the article lao is followed by alien text.

Foreign names must be followed by pauses in speech (and at least whitespace in text), but the parser does not require a terminal comma on a foreign name.

LAO1 <- (L AO)

LAO <- (([ ])* (LAO1 stringnospaces (([y] stringnospaces))*))
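
To make the block structure of foreign names concrete, here is a simplified sketch that splits a lao name into its blocks. It is not the parser's algorithm: it works on whitespace-separated words, ignores commas and terminal punctuation, and treats anything other than a standalone y after a block as the end of the name. The function name foreign_name_blocks is hypothetical.

```python
def foreign_name_blocks(text: str) -> list[str]:
    """Split a lao name into its blocks of alien text.

    Blocks are whitespace-free runs separated by a standalone y.
    """
    words = text.split()
    if not words or words[0].lower() != "lao":
        raise ValueError("expected text starting with lao")
    blocks = []
    expect_block = True
    for word in words[1:]:
        if expect_block:
            blocks.append(word)
            expect_block = False
        elif word == "y":
            expect_block = True
        else:
            break  # anything else ends the name
    if expect_block:
        raise ValueError("lao or y must be followed by a block")
    return blocks

print(foreign_name_blocks("lao Homo y sapiens"))  # ['Homo', 'sapiens']
```

Without the written y, only the first block would be taken as part of the name, which is the point of requiring it.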

This is the strong quotation construction, completely different from the 1989 Loglan construction for strong quotation. Instead, it is formally isomorphic to foreign names: the article lie is followed by a block of alien text. Notice that when quoting a text containing pauses, one must insert y between the pause-separated blocks of the quoted text.

LIE1 <- (L IE)

LIE <- (([ ])* (LIE1 (Quotemod)? stringnospaces (([y] stringnospaces))*))

The arrangements for word quotation follow. LW is the actual class of cmapua as defined in Notebook 3. It is telling that this is the only place that this class is actually needed by the parser (in quoting cmapua). In LIP it may not have been used at all, as LIP seems to allow liu to quote only actual cmapua.

Words are quoted using liu or niu (the latter may be used to signal that the word quoted is not actually a Loglan word). The quoted items must be cmapua (which must often be closed by an explicit comma pause), name words, predicate words or djifoa. Letter names can be quoted with lii.

LW <- (&(caprule) (((!(Predicate) V2 V2))+ / ((!(Predicate) (V2)? ((!(Predicate) LWunit))+) / V2)))

LIU0 <- ((L IU) / (N IU))

LIU1 <- (__LWinit ((LIU0 !(badspaces) !(V2) (Quotemod)? ((([,])? ([ ])+))? (Name / (Predicate (comma)?) / (CCV (comma)?) / (LW (([,] ([ ])+ !([,])) / &(period) / &(end) / &((([ ])* Predicate)))))) / (L II (Quotemod)? TAI !(connective))))

These are the imported foreign predicates (sao) and onomatopoeic predicates (sue) which are formed by following the appropriate particle with a single block of alien text.

SUE <- (__LWinit ((S UE) / (S AO)) stringnospaces)

Spoken punctuation of various kinds

Used as a left closure for metaphor grouping in predicate phrases.

CUI <- (__LWinit (C UI) !(connective))

The form of ga used in the "gasent" sentence construction to mark the subject.

GA2 <- (__LWinit (G a) !(connective))

A general left closure marker for many constructions.

GE <- (__LWinit (G e) !(connective))

A right closure marker for metaphor grouping in predicate phrases. This still admits the form cue, found in L1 but later revised to geu.

GEU <- (__LWinit ((C UE) / (G EU)) !(connective))

A marker for fronted arguments in one of the sentence forms.

GI <- (__LWinit ((G i) / (G OI)) !(connective))

A marker for delayed initial predicates in metaphor grouping.

GO <- (__LWinit (G o) !(connective))

A marker for second and further arguments appearing before the predicate (new).

GIO <- (__LWinit (G IO) !(connective))

The general spoken right closure.

GU <- (__LWinit (G u) !(connective))

Spoken right closure for subordinate clauses formed with JI or JIO. Distinct versions pair with the distinct clause markers given below, a device to avoid the need for multiple closures, if care is taken.

GUIZA <- (__LWinit (G UI) (Z a) !(connective))

GUIZI <- (__LWinit (G UI) (Z i) !(connective))

GUIZU <- (__LWinit (G UI) (Z u) !(connective))

GUI <- (!(GUIZA) !(GUIZI) !(GUIZU) (__LWinit (G UI) !(connective)))
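
The three negative lookaheads in GUI have a close analogue in regular-expression lookahead. The following is a rough sketch only: it ignores __LWinit, the !(connective) check and the phonetic conditions built into the letter classes, and the names PLAIN_GUI and is_plain_gui are hypothetical.

```python
import re

# gui not followed by za, zi or zu (a rough analogue of the three negative lookaheads).
PLAIN_GUI = re.compile(r"gui(?!z[aiu])", re.IGNORECASE)

def is_plain_gui(word: str) -> bool:
    """Return True if word starts with gui but not with guiza, guizi or guizu."""
    return PLAIN_GUI.match(word) is not None

print(is_plain_gui("gui"))    # True
print(is_plain_gui("guiza"))  # False
```

The same pattern is used further on, where JI and JIO exclude their za, zi and zu forms.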

Spoken right closures for abstract predicates and abstract and event descriptions.

The second and further ones are part of a new proposal.

GUO <- (__LWinit (G UO) !(connective))

GUOA <- (__LWinit (G UO (Z)? a) !(connective))

GUOE <- (__LWinit (G UO e) !(connective))

GUOI <- (__LWinit (G UO (Z)? i) !(connective))

GUOO <- (__LWinit (G UO o) !(connective))

GUOU <- (__LWinit (G UO (Z)? u) !(connective))

The spoken right closure for term lists.

GUU <- (__LWinit (G UU) !(connective))

New closer introduced 5/9, for arguments in certain contexts.

GUUA <- (__LWinit (G UU a) !(connective))

New closer introduced 5/9, for sentences in certain contexts.

GIUO <- (__LWinit (G IU o) !(connective))

The spoken right closure for tightly bound term constructions with JE and JUE.

GUE <- (__LWinit (G UE) !(connective))

New closer introduced 5/9, for descriptive predicates in certain contexts.

GUEA <- (__LWinit (G UE a) !(connective))

Clause formers and tightly linked argument formers

Markers for tightly bound terms.

JE <- (__LWinit (J e) !(connective))

JUE <- (__LWinit (J UE) !(connective))

Markers for subordinate clauses. New flavors have been added to match new closers, in a bid to minimize the need for multiple closures.

JIZA <- (__LWinit ((J IE) / (J AE) / (P e) / (J i) / (J a) / (N u J i)) (Z a) !(connective))

JIOZA <- (__LWinit ((J IO) / (J AO)) (Z a) !(connective))

JIZI <- (__LWinit ((J IE) / (J AE) / (P e) / (J i) / (J a) / (N u J i)) (Z i) !(connective))

JIOZI <- (__LWinit ((J IO) / (J AO)) (Z i) !(connective))

JIZU <- (__LWinit ((J IE) / (J AE) / (P e) / (J i) / (J a) / (N u J i)) (Z u) !(connective))

JIOZU <- (__LWinit ((J IO) / (J AO)) (Z u) !(connective))

JI <- (!(JIZA) !(JIZI) !(JIZU) (__LWinit ((J IE) / (J AE) / (P e) / (J i) / (J a) / (N u J i)) !(connective)))

JIO <- (!(JIOZA) !(JIOZI) !(JIOZU) (__LWinit ((J IO) / (J AO)) !(connective)))

Case tags

Case tags proper.

DIO <- (__LWinit ((B EU) / (C AU) / (D IO) / (F OA) / (K AO) / (J UI) / (N EU) / (P OU) / (G OA) / (S AU) / (V EU) / (Z UA) / (Z UE) / (Z UI) / (Z UO) / (Z UU)) !(connective))

Markers of indirect reference, which had the same grammar as case tags in 1989 Loglan but somewhat different grammar here.

LAE <- (__LWinit ((L AE) / (L UE)) !(connective))

Operators on predicates

Convert an argument to a predicate. We regard mea as entirely redundant.

ME <- (__LWinit ((M EA) / (M e)) !(connective))

Closer for ME predicates, new.

MEU <- (__LWinit M EU !(connective))

Conversion and reflexive operators (change order of arguments or identify two of them). There are two rules because a digit may be suffixed.

NU0 <- ((N UO) / (F UO) / (J UO) / (N u) / (F u) / (J u))

NU <- (__LWinit ((NU0 !((([ ])+ (NI0 / RA))) ((NI0 / RA))? (freemod)?))+ !(connective))

Operators for construction of abstract predicates and descriptions. The second and further ones are parts of a new proposal.

PO1 <- (__LWinit ((P o) / (P u) / (Z o)))

PO1A <- (__LWinit ((P OI a) / (P UI a) / (Z OI a) / (P o Z a) / (P u Z a) / (Z o Z a)))

PO1E <- (__LWinit ((P OI e) / (P UI e) / (Z OI e)))

PO1I <- (__LWinit ((P OI i) / (P UI i) / (Z OI i) / (P o Z i) / (P u Z i) / (Z o Z i)))

PO1O <- (__LWinit ((P OI o) / (P UI o) / (Z OI o)))

PO1U <- (__LWinit ((P OI u) / (P UI u) / (Z OI u) / (P o Z u) / (P u Z u) / (Z o Z u)))

Short scope forms of po and its relatives.

POSHORT1 <- (__LWinit ((P OI) / (P UI) / (Z OI)))

The above classes packaged as words.

PO <- (__LWinit PO1 !(connective))

POA <- (__LWinit PO1A !(connective))

POE <- (__LWinit PO1E !(connective))

POI <- (__LWinit PO1I !(connective))

POO <- (__LWinit PO1O !(connective))

POU <- (__LWinit PO1U !(connective))

POSHORT <- (__LWinit POSHORT1 !(connective))

Free modifier constructors

Register markers.

DIE <- (__LWinit ((D IE) / (F IE) / (K AE) / (N UE) / (R IE)) !(connective))

Vocative forming operators. These include the greeting words, for us. The word sie (sorry, with the intention of apology) is new. Unmarked vocatives (simple uses of a name word as a free modifier) are now forbidden.

HOI <- (__LWinit ((H OI) / (L OI) / (L OA) / (S IA) / (S IE) / (S IU)) !(connective))

The "scare quotes" operator. It has a numerical suffix to indicate the number of words in its scope.

JO <- (__LWinit ((NI0 / RA))? (J o) !(connective))

Spoken parentheses to create a parenthetical remark.

KIE <- (__LWinit (K IE) !(connective))

KIU <- (__LWinit (K IU) !(connective))

The spoken "smilie" constructor.

SOI <- (__LWinit (S OI) !(connective))

A list of free modifier words (attitudinals) including most of the VV words.

Notice that NOUI is phonetically irregular: a "word" like noia has the vowels grouped in an unexpected way and also has the o in a CV syllable followed by a vowel. Normally a pause would be forced; this is overridden.

UI0 <- (!(Predicate) (UA / UE / UI / UO / UU / OA / OE / OI / OU / OO / IA / II / IO / IU / EA / EE / EI / EO / EU / AA / AE / AI / AO / AU / (B EA) / (B UO) / (C EA) / (C IA) / (C OA) / (D OU) / (F AE) / (F AO) / (F EU) / (G EA) / (K UO) / (K UU) / (R EA) / (N AO) / (N IE) / (P AE) / (P IU) / (S AA) / (S UI) / (T AA) / (T OE) / (V OI) / (Z OU) / (L OI) / (L OA) / (S IA) / (S II) / (T OE) / (S IU) / (C AO) / (C EU) / (S IE) / (S EU)))

NOUI <- ((__LWinit N [o] (juncture)? ([ ])* UI0 !(connective)) / (__LWinit UI0 NOI !(connective)))

The above class and the NI fi words (like "secondly"), packaged as words.

UI1 <- (__LWinit (UI0 / (NI F i)) !(connective))

The inverse vocative constructor.

HUE <- (__LWinit (H UE) !(connective))


All occurrences of the no of negation as an independent word.

NO1 <- (__LWinit !(KOU1) !(NOUI) (N o) !((__LWinit KOU)) !((([ ])* (JIO / JI / JIZA / JIOZA / JIZI / JIOZI / JIZU / JIOZU))) !(connective))

The large word classes

Name words. These include the acronymic names, as noted above.

AcronymicName <- (Acronym (&(end) / ',' / &(period) / &(Name) / &(CI)))

DJAN <- (Name / AcronymicName)

Predicate words in the grammatical sense. These include some phonetic cmapua as indicated. The identity predicates BI are a separate word class, actually. The other phonetic cmapua are either listed in LWPREDA or are numerical predicates NI RA, and are included in the class PREDA of predicate words, at the grammar level if not phonetically.

BI <- (__LWinit ((N u))? ((B IA) / (B IE) / (C IE) / (C IO) / (B IA) / (B [i])) !(connective))

LWPREDA <- ((H e) / (D UA) / (D UI) / (B UA) / (B UI))

PREDA <- (([ ])* &(caprule) (Predicate / LWPREDA / (!([ ]) NI RA)) !(connective))


Closing forms

Right closure words listed above packaged for grammatical use. They may be preceded by pauses; each right closure form may be expressed as gu as well as in its full guV (or other) form if it succeeds in closing the right sort of thing in the context where it is used. They may further be followed by free modifiers.

guoa <- ((PAUSE)? (GUOA / GU) (freemod)?)

guoe <- ((PAUSE)? (GUOE / GU) (freemod)?)

guoi <- ((PAUSE)? (GUOI / GU) (freemod)?)

guoo <- ((PAUSE)? (GUOO / GU) (freemod)?)

guou <- ((PAUSE)? (GUOU / GU) (freemod)?)

guo <- (!(guoa) !(guoi) !(guou) ((PAUSE)? (GUO / GU) (freemod)?))

guiza <- ((PAUSE)? (GUIZA / GU) (freemod)?)

guizi <- ((PAUSE)? (GUIZI / GU) (freemod)?)

guizu <- ((PAUSE)? (GUIZU / GU) (freemod)?)

gui <- ((PAUSE)? (GUI / GU) (freemod)?)

gue <- ((PAUSE)? (GUE / GU) (freemod)?)

guea <- ((PAUSE)? (GUEA / GU) (freemod)?)

guu <- ((PAUSE)? (GUU / GU) (freemod)?)

guua <- ((PAUSE)? (GUUA / GU) (freemod)?)

giuo <- ((PAUSE)? (GIUO / GU) (freemod)?)

meu <- ((PAUSE)? (MEU / GU) (freemod)?)

The next one does not take the form gu.

geu <- GEU

This packaged form of general gu does not include the alternative form PAUSE found in 1989 Loglan. Pause/gu equivalence is not implemented.

gap <- ((PAUSE)? GU (freemod)?)
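The guV closers above all share one shape: an optional pause, then the specific closer or plain gu tried in that order, then an optional free modifier. The ordered choice can be pictured with a minimal Python sketch. It is an illustration only, not the parser's implementation: the function name read_closer is hypothetical, PAUSE is approximated by a comma and a space, freemod is omitted, and the word-boundary checks that the real word classes perform (__LWinit, !(connective)) are not modelled.

```python
def read_closer(text, pos, full_form):
    """Read an optional pause, then `full_form` or plain 'gu' (in that order).

    Returns (matched_word, new_pos), or None if neither closer is present.
    Because the full form is tried first, 'guoa' is read whole rather than
    as 'gu' followed by leftover 'oa'.
    """
    # (PAUSE)? : an optional comma plus space, a stand-in for the real PAUSE class
    if text.startswith(", ", pos):
        pos += 2
    # (GUOA / GU) : ordered choice, first success wins
    for word in (full_form, "gu"):
        if text.startswith(word, pos):
            return word, pos + len(word)
    return None


print(read_closer("guoa", 0, "guoa"))   # full form is preferred
print(read_closer(", gu", 0, "guoa"))   # pause, then the generic closer
print(read_closer("ga", 0, "guoa"))     # no closer here
```

The rule gap corresponds to the degenerate case in which only the generic gu is offered.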

Tightly bound argument lists

Arguments or modifiers following the predicate may be tightly bound to it. The basic form is je (argument or modifier), for a first term after the predicate, and jue (argument or modifier) (gue) for second and further terms after the predicate. A sequence of such arguments can optionally be closed with gue. Sequences of such arguments can be linked with forethought or afterthought logical or causal connectives.

juelink <- (JUE (freemod)? (term / (PA2 (freemod)? (gap)?)))

links1 <- (juelink (((freemod)? juelink))* (gue)?)

links <- ((links1 / (KA (freemod)? links (freemod)? KI (freemod)? links1)) (((freemod)? A1 (freemod)? links1))*)

jelink <- (JE (freemod)? (term / (PA2 (freemod)? (gap)?)))

linkargs1 <- (jelink (freemod)? ((links / gue))?)

linkargs <- ((linkargs1 / (KA (freemod)? linkargs (freemod)? KI (freemod)? linkargs1)) (((freemod)? A1 (freemod)? linkargs1))*)

Simpler predicate forms

Abstraction predicates. These predicates are built from sentences of general forms and have full syntactical privileges, which was not the case in the grammar of 1989 Loglan, in which an abstraction predicate built from a complex sentence could not participate in any construction of more complicated predicate expressions: it could not even be tensed. The abstraction predicate is optionally closed with guo. Note that there is an entire series of different matching openers and closers for abstractions which can be used to allow more supple handling of nested expressions of this kind; this is expected to be used more in the case of abstract descriptions.

abstractpred <- ((POA (freemod)? uttAx (guoa)?) / (POA (freemod)? sentence (guoa)?) / (POE (freemod)? uttAx (guoe)?) / (POE (freemod)? sentence (guoe)?) / (POI (freemod)? uttAx (guoi)?) / (POI (freemod)? sentence (guoi)?) / (POO (freemod)? uttAx (guoo)?) / (POO (freemod)? sentence (guoo)?) / (POU (freemod)? uttAx (guou)?) / (POU (freemod)? sentence (guou)?) / (PO (freemod)? uttAx (guo)?) / (PO (freemod)? sentence (guo)?))

Predicates which are in a sense atomic. These include the foreign and onomatopoeic predicates constructed from alien text, simple predicate words optionally converted by being prefixed with NU, predicates of class despredE enclosed in ge...(geu), optionally converted with prefixed NU, abstraction predicates as just described above, and expressions ME argument (MEU).

predunit1 <- ((SUE / (NU (freemod)? GE (freemod)? despredE (((freemod)? geu (comma)?))?) / (NU (freemod)? PREDA) / ((comma)? GE (freemod)? descpred (((freemod)? geu (comma)?))?) / abstractpred / (ME (freemod)? argument1 (meu)?) / PREDA) (freemod)?)

Possibly multiply negated instances of the previous class.

predunit2 <- (((NO1 (freemod)?))* predunit1)

Instances of the no of negation which are independent words and which do not negate a following predunit1.

NO2 <- (!(predunit2) NO1)
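The negative lookahead in NO2 can be imitated with a regular-expression lookahead, which likewise consumes no input. The following Python sketch is a toy under stated assumptions: predunit1 is replaced by a hypothetical two-word list (blanu, hasfa), NO1 is simplified to the bare string "no", free modifiers are ignored, and the helper name is_no2 is invented for illustration.

```python
import re

# Hypothetical stand-in for predunit1: two predicate words only.
PREDUNIT1 = r"(?:blanu|hasfa)"
# predunit2 <- (((NO1 (freemod)?))* predunit1), with NO1 simplified to "no "
PREDUNIT2 = rf"(?:no )*{PREDUNIT1}\b"
# NO2 <- (!(predunit2) NO1): the lookahead checks and then consumes nothing
NO2 = re.compile(rf"(?!{PREDUNIT2})no\b")


def is_no2(text):
    """True if `text` starts with a 'no' that does not negate a predunit1."""
    return NO2.match(text) is not None


print(is_no2("no blanu"))     # False: this no belongs to a predunit2
print(is_no2("no no hasfa"))  # False: still part of a negated predunit1
print(is_no2("no la Djan"))   # True: nothing here for the no to negate
```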

An instance of the previous class optionally followed by tightly bound arguments.

predunit3 <- ((predunit2 (freemod)? linkargs) / predunit2)

An instance of the previous class or an instance of the previous class converted to an abstraction predicate by being prefixed with short-scope POSHORT. New words poi, pui, zoi have the short scope abstraction function.

predunit <- (((POSHORT (freemod)?))? predunit3)

Forethought connected predicates of the most complex kind. These are also viewed as in a sense atomic. Note that they can be closed with GUU so as to serve as heads of metaphors.

kekpredunit <- (((NO1 (freemod)?))* KA (freemod)? predicate (freemod)? KI (freemod)? predicate (guu)?)

Instances of predunit and kekpredunit linked with ci if there is more than one of them. This is tightly bound metaphorical modification of predicates.

despredA <- ((predunit / kekpredunit) (((freemod)? CI (freemod)? (predunit / kekpredunit)))*)

The result of prepending zero or more (cui despredC CA)'s to an instance of the previous class.

despredB <- ((!(PREDA) CUI (freemod)? despredC (freemod)? CA (freemod)? despredB) / despredA)

A simple chain of one or more despredB's: more loosely bound metaphorical predicate modification. Note that semantically it groups to the left.

despredC <- (despredB (((freemod)? despredB))*)

A sequence of one or more despredB's linked by CA connectives (if there is more than one).

despredD <- (despredB (((freemod)? CA (freemod)? despredB))*)

A simple chain of despredD's: this is again a metaphorical modification of predicates, grouping to the left semantically.

despredE <- (despredD (((freemod)? despredD))*)

A despredE metaphorically modified by a descpred following it, marked with go. This is the kind of predicate which follows an article to form a description.

descpred <- ((despredE (freemod)? GO (freemod)? descpred) / despredE)
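The grouping remarks in this section can be pictured with a small Python sketch. The chain rules (despredC, despredE) are flat repetitions in the PEG, and the left grouping is a semantic reading imposed afterwards, whereas the go rule is literally right-recursive. The helper names group_left and group_right are hypothetical, and the nested tuples only stand in for whatever structure an interpreter of the parse would actually build.

```python
from functools import reduce


def group_left(units):
    """Nest a chain so that earlier units bind first: a b c -> ((a, b), c)."""
    return reduce(lambda head, nxt: (head, nxt), units)


def group_right(units):
    """Nest a go-chain as the recursive rule does: a go b go c -> (a, (b, c))."""
    first, rest = units[0], units[1:]
    return first if not rest else (first, group_right(rest))


print(group_left(["a", "b", "c"]))   # (('a', 'b'), 'c')
print(group_right(["a", "b", "c"]))  # ('a', ('b', 'c'))
```

Both helpers expect at least one unit, matching the "one or more" shape of the rules.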

The following simple sentence predicate class differs from the simple description predicate class descpred only in that a modifier optionally appended with go is of the looser barepred class which can have arguments attached which are not tightly bound with JE/JUE links.

sentpred <- ((despredE (freemod)? GO (freemod)? barepred) / despredE)


This section describes the general construction of sentence modifiers (relative clauses). The basic forms are PA3 argument (guua) and PA2 (gap), where the latter is not followed by a barepred (so it is not a tense). Recall that PA3 is the class of PA words without pauses between PA0 units, while PA2 allows such pauses. There are reasons for this which can be illustrated with examples.

Further, sentence modifiers can be linked with forethought and afterthought logical and causal connectives in a sensible way.

mod1a <- (PA3 (freemod)? argument1 (guua)?)

mod1 <- ((PA3 (freemod)? argument1 (guua)?) / (PA2 (freemod)? !(barepred) (gap)?))

kekmod <- (((NO1 (freemod)?))* (KA (freemod)? modifier (freemod)? KI (freemod)? mod))

mod <- (mod1 / (((NO1 (freemod)?))* mod1) / kekmod)

modifier <- (mod ((A1 (freemod)? mod))*)

Name resolution and serial names

This section deals with the construction of complex names. It contains my solution to the false name marker problem.

This is a place at which it is possible but not certain that a phonetic pause occurs (a space between a vowel and a consonant).

maybebreak <- (V1 (stress)? ' ' !((([ ])* V1)))
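As a rough illustration, maybebreak can be approximated by a regular expression with a negative lookahead. This Python sketch makes assumptions that are not in the grammar: V1 is taken to be the five plain vowels, stress marks are ignored, and the helper name maybebreak_positions is invented.

```python
import re

# Assumption: V1 is approximated by the five plain vowels; stress is ignored.
# A vowel, one space, and then no vowel after any further spaces.
MAYBEBREAK = re.compile(r"[aeiou] (?! *[aeiou])")


def maybebreak_positions(text):
    """Return the index of each vowel that ends in a possible-but-uncertain pause."""
    return [m.start() for m in MAYBEBREAK.finditer(text.lower())]


print(maybebreak_positions("le blanu hasfa"))  # [1, 7]
print(maybebreak_positions("la ata"))          # []: made-up vowel-initial string
```

The second call returns nothing because the space is followed by a vowel, which the lookahead in the rule excludes.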

This is a documented break in the flow of speech.

realbreak <- (!(maybebreak) letter (stress)? ((([,])? ' ') / period / &(end)))

This is a break in the flow of speech following a consonant.

consonantbreak <- (C1 (stress)? ((([,])? ' ') / period / &(end)))

This is a situation in which there is not an immediately following pause and there are maybebreaks and no realbreaks between the present position and a following consonantbreak.

badspaces <- (!(([,] ' ')) ((!((maybebreak / realbreak)) .))* maybebreak ((!(realbreak) .))* consonantbreak)

These are the name marker words. A name word may appear without a leading pause only after one of these. A name marker is required not to be followed by badspaces: that is, either there must be a pause immediately following it, or there must be no possible but uncertain pauses between it and a following pause after a consonant. The problem is that if none of these uncertain pauses is real, the entire phrase up to that distant consonant break, clearly not intended to be a name word, would be heard as a name word.

NOTE: niu should be added to this list.

namemarker <- ((([ ])* ((L a) / (H OI) / (L OI) / (L OA) / (S IE) / (S IA) / (S IU) / (C i) / (H UE) / (L IU) / (G AO))) !(badspaces))

The class of names which contain no phonetic string looking like a name marker word and followed by a valid name word. Name words not in this class are the name words which contain false name markers.

nonamemarkers <- (([ ])* ((!((namemarker Name)) Letter))+ !(Letter))

This is ci used as a name marker.

CI0 <- ([Cc] i &((([ ])* C1)))

This is the serial name construction. A serial name is a chain of name words and predunits in which all predunits are marked with ci, anything following a predunit is marked with ci, and any name which contains a false name marker is marked with ci.

name <- (DJAN (((CI0 DJAN) / (CI !(badspaces) (comma)? predunit !((&(nonamemarkers) Name))) / (CI (comma)? DJAN) / (&(nonamemarkers) Name)))* (freemod)?)

This is la used as a name marker.

LA0 <- (([Ll] a) !(badspaces))

la without a pause followed by something which can be read as a name word is preferred to be read as such. Notice that for this to happen the name word must be consonant-initial or CANCELPAUSE must be used.

LANAME <- (([ ])* LA0 (CANCELPAUSE / (([ ])* &(C1))) name)

la followed by an explicit pause followed by something which can be read as a name word will be parsed as such only if it is not parsed as a description in a different way.

LANAME2 <- (([ ])* LA0 ((',' ([ ])+) / (([ ])* &(V1))) name)


Here is the construction of vocatives. Note that the use of an unmarked name as a vocative and therefore a free modifier was outlawed; this was a major cause of false name marker issues. The first class is hoi used as a name marker. The second is the vocative class itself. By preference, one first reads hoi or kin followed without pause (or with cancelled pause) by a name word; then one tries to read hoi or kin followed by a descriptive predicate (which possibly has a name appended to it), then one tries to read hoi or kin followed by an argument, then one tries to read hoi or kin followed by a pause followed by a name word, then one tries to read hoi (exactly, not one of its kin) followed by a foreign name, which must be closed by a comma, terminal punctuation, or end of text.

HOI0 <- ((([Hh] OI) / ([Ll] OI) / ([Ll] OA) / ([Ss] IA) / ([Ss] IE) / ([Ss] IU)) !(badspaces))

voc <- ((([ ])* HOI0 (CANCELPAUSE / (([ ])* &(C1))) name) / (HOI !(badspaces) (freemod)? descpred (guea)? (((((comma)? CI (comma)?) / (comma &(nonamemarkers) !(AcronymicName))) name))?) / (HOI !(badspaces) (freemod)? argument1 (guua)?) / (([ ])* HOI0 ((',' ([ ])+) / (([ ])* &(V1))) name) / (H OI stringnospacesclosedblock))


We now present the quite complex analysis of arguments.

First is a basic class of constructions with articles. First we have the constructions of explicit ordered and unordered lists. Then we have basic constructions with the class of LE articles. Each of these begins with a LE article, optionally followed by an argument of class arg1a (playing the role of a possessive), optionally followed by a PA2 tense, optionally followed by a mex quantifier, followed by either a descpred descriptive predicate or an arg1a argument (in this last case the mex component must appear).

There is a novelty here. In 1989 Loglan, lemi hasfa and le la Djan, hasfa were not instances of the same construction. In the first, lemi was parsed as a LE word. lemi is no longer a LE word, and the grammar of these two assertions is the same. By analogy with the 1989 Loglan word lemina, this rule allows arguments like le la Djan, na hasfa, "John's present house".

descriptn <- (!(LANAME) ((LAU wordset1) / (LOU wordset2) / (LE (freemod)? ((((!(mex) arg1a (freemod)?))? ((PA2 (freemod)?))?))? mex (freemod)? descpred) / (LE (freemod)? ((((!(mex) arg1a (freemod)?))? ((PA2 (freemod)?))?))? mex (freemod)? arg1a) / (GE (freemod)? mex (freemod)? descpred) / (LE (freemod)? ((((!(mex) arg1a (freemod)?))? ((PA2 (freemod)?))?))? descpred)))

Abstraction and event descriptions. Multiple allowed openers and closers allow us to close multiple nested abstract descriptions with one closer with proper planning. Note that a LEFORPO PO (sentence) (GUO) does not contain a PO (sentence) (GUO) substructure: these are two independent constructions. Where a LE article is applied to a (PO predicate) modifying another argument, the (PO predicate) should be prefixed with ge to avoid accidental formation of an abstract description. The distinction between abstract predicates and abstract descriptions avoids any need for double closures of abstract descriptions.

abstractn <- ((LEFORPO (freemod)? POA (freemod)? uttAx (guoa)?) / (LEFORPO (freemod)? POA (freemod)? sentence (guoa)?) / (LEFORPO (freemod)? POE (freemod)? uttAx (guoe)?) / (LEFORPO (freemod)? POE (freemod)? sentence (guoe)?) / (LEFORPO (freemod)? POI (freemod)? uttAx (guoi)?) / (LEFORPO (freemod)? POI (freemod)? sentence (guoi)?) / (LEFORPO (freemod)? POO (freemod)? uttAx (guoo)?) / (LEFORPO (freemod)? POO (freemod)? sentence (guoo)?) / (LEFORPO (freemod)? POU (freemod)? uttAx (guou)?) / (LEFORPO (freemod)? POU (freemod)? sentence (guou)?) / (LEFORPO (freemod)? PO (freemod)? uttAx (guo)?) / (LEFORPO (freemod)? PO (freemod)? sentence (guo)?))

A variety of basic argument constructions, including abstract descriptions, all forms with the numerical article lio (note that lio stringnospaces allows reference to Arabic numerals), foreign names, names with la followed by no pause or a cancelled pause (attempted first), followed by descriptn possibly suffixed with names (attempted second), followed by names with la followed by a pause (read only if no initial segment of the name word parses as a descriptive predicate), and further, quoted words or characters, strong quotations, and quoted Loglan text.

Where a descriptn is suffixed with a name, the name may be unmarked (but set off by an initial explicit comma pause) if it is a consonant final name word with no false name markers, and otherwise will be marked with ci. A construction of this kind, such as la blanu, Djan, is not properly speaking a serial name.

arg1 <- (abstractn / (LIO (freemod)? descpred (guea)?) / (LIO (freemod)? argument1 (guua)?) / (LIO (freemod)? mex (gap)?) / (LIO stringnospaces) / LAO / LANAME / (descriptn (guua)? (((((comma)? CI (comma)?) / (comma &(nonamemarkers) !(AcronymicName))) name))?) / LANAME2 / LIU1 / LIE / LI)

Arguments which are atomic in a certain sense: pronouns, members of the previous class, and members of this same class prefixed with ge.

arg1a <- ((DA / TAI / arg1 / (GE (freemod)? arg1a)) (freemod)?)

Subordinate clauses formed with ji or jio or kin. JI class particles form subordinate clauses with predicates, arguments, or modifiers; JIO class particles form subordinate clauses with sentences (of particular kinds). Subordinate clauses can be negated, and they can be linked with afterthought logical connectives. Such afterthought linked chains of subordinate clauses (including a chain of just one clause) can be closed with gui or gu.

argmod1 <- ((((__LWinit (N o) ([ ])*))? ((JI (freemod)? predicate) / (JIO (freemod)? sentence) / (JIO (freemod)? uttAx) / (JI (freemod)? modifier) / (JI (freemod)? argument1))) / (((__LWinit (N o) ([ ])*))? (((JIZA (freemod)? predicate) (guiza)?) / ((JIOZA (freemod)? sentence) (guiza)?) / ((JIOZA (freemod)? uttAx) (guiza)?) / ((JIZA (freemod)? modifier) (guiza)?) / (JIZA (freemod)? argument1 (guiza)?))) / (((__LWinit (N o) ([ ])*))? ((JIZI (freemod)? predicate (guizi)?) / (JIOZI (freemod)? sentence (guizi)?) / (JIOZI (freemod)? uttAx (guizi)?) / (JIZI (freemod)? modifier (guizi)?) / (JIZI (freemod)? argument1 (guizi)?))) / (((__LWinit (N o) ([ ])*))? ((JIZU (freemod)? predicate (guizu)?) / (JIOZU (freemod)? sentence (guizu)?) / (JIOZU (freemod)? uttAx (guizu)?) / (JIZU (freemod)? modifier (guizu)?) / (JIZU (freemod)? argument1 (guizu)?))))

argmod <- (argmod1 ((A1 (freemod)? argmod1))* (gui)?)

This class consists of an arg1a with a subordinate clause (or logically connected subordinate clauses) attached. The next class arg3 attaches a quantifier.

arg2 <- (arg1a (freemod)? (argmod)*)

arg3 <- (arg2 / (mex (freemod)? arg2))

Indefinite arguments: a quantifier followed by a descriptive predicate, then this class with subordinate clause(s) attached.

indef1 <- (mex (freemod)? descpred)

indef2 <- (indef1 (guua)? (argmod)*)

NOTE: fix the stutter in the naming here?

indefinite <- indef2

A string of arguments of class arg3 or indefinite linked with the mixture connective ze.

arg4 <- ((arg3 / indefinite) ((ZE2 (freemod)? (arg3 / indefinite)))*)

Forethought connected arguments, followed by possibly multiply negated and indirectly referenced arguments.

arg5 <- (arg4 / (KA (freemod)? argument1 (freemod)? KI (freemod)? argx))

argx <- (((NO1 (freemod)?))* ((LAE (freemod)?))* arg5)

Arguments of class argx linked with the most tightly binding afterthought logical connectives, the ACI series.

arg7 <- (argx (freemod)? ((ACI (freemod)? argx))?)

Arguments of class arg7 linked with the usual A afterthought logical connectives.

An item of this class cannot begin with ge to avoid phonetic ambiguity with the AGE connectives. This is an example of ambiguity in 1989 Loglan caused by the lexer being invisible to the grammar analysis.

arg8 <- (!(GE) (arg7 (freemod)? ((A1 (freemod)? arg7))*))

Arguments linked with the least binding and right grouping AGE afterthought logical connectives; further, optionally subordinate clauses may be attached to an instance of this class, marked with guu.

argument1 <- (((arg8 (freemod)? AGE (freemod)? argument) / arg8) ((GUU (freemod)? argmod))*)

Possibly multiply negated, possibly multiply case tagged arguments.

I drew a distinction between contexts where case tags are appropriate and contexts where they are not which I believe is not drawn by the LIP grammar.

argument <- (((NO1 (freemod)?))* ((DIO (freemod)?))* argument1)

These are intended to represent arguments in the first, second, third and fourth argument position after the predicate; they are all the same class internally, but the distinct class names give useful annotations to the parses.

argumentA <- argument

argumentB <- argument

argumentC <- argument

argumentD <- argument

This is an argument actually decorated with at least one case tag. NOTE: the argument probably should not be allowed to begin with NO1 here.

argxx <- (&((((NO1 (freemod)?))* DIO)) argument)

Term lists (including word lists for set forms)

A term attached to a predicate is either an argument or a modifier.

term <- (argument / modifier)

A term list made up entirely of modifiers.

modifiers <- (modifier (((freemod)? modifier))*)

A term list made up entirely of modifiers and arguments with explicit case tags.

modifiersx <- ((modifier / argxx) (((freemod)? (modifier / argxx)))*)

An argument which can be read as starting an SVO sentence.

firstarg <- (argument1 ((modifiersx (freemod)?))? ((GIO (freemod)? terms))? predicate !(GA2))

A term list with untagged arguments typed with the lettered argument classes (and so with no more than four untagged arguments). In addition, the second and subsequent untagged arguments appearing as terms cannot start an SVO sentence.

terms <- (((modifiersx)? argumentA (((freemod)? modifiersx))? ((!(firstarg) argumentB))? (((freemod)? modifiersx))? ((!(firstarg) argumentC))? (((freemod)? modifiersx))? ((!(firstarg) argumentD))?) / modifiersx)
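The way terms hands out the lettered argument classes can be sketched in Python. This is a simplified picture, not the parser: terms are reduced to three hypothetical kind labels ('arg' for an untagged argument, 'tagged' for an argxx, 'mod' for a modifier), the function name label_terms is invented, and the !(firstarg) checks are not modelled.

```python
LABELS = ["argumentA", "argumentB", "argumentC", "argumentD"]


def label_terms(kinds):
    """Label a list of term kinds ('arg', 'tagged', 'mod') in the manner of `terms`.

    Untagged arguments take the lettered classes in order. Reading stops
    once argumentD has been assigned, since the rule has nothing after it;
    anything further is left unread, as a PEG rule leaves unmatched input.
    """
    out, used = [], 0
    for kind in kinds:
        if kind == "arg":
            out.append(LABELS[used])
            used += 1
            if used == len(LABELS):
                break  # nothing follows argumentD in the rule
        elif kind == "tagged":
            out.append("argxx")
        elif kind == "mod":
            out.append("modifier")
        else:
            raise ValueError(f"unknown term kind: {kind!r}")
    return out


print(label_terms(["mod", "arg", "tagged", "arg"]))
print(label_terms(["arg"] * 5))  # the fifth untagged argument is not read
```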

In the "official" parser, firstarg does not appear and !firstarg annotations in terms should be ignored.

The internals of explicit ordered and unordered lists. These are completely redesigned.

word <- (arg1a / indef2)

words1 <- (word ((ZEIA word))*)

words2 <- (word ((ZEIO word))*)

wordset1 <- ((words1)? LUA)

wordset2 <- ((words2)? LUO)

Lists of terms following the predicate: these can be connected with forethought and afterthought connectives and may be optionally closed with guu. None of the terms may start an SVO sentence (but see note below on removing this condition). There is the entertaining option of closing a termset with (go barepred), where the barepred metaphorically modifies the predicate before the termset.

To obtain the "official" parser, omit the !firstarg annotation.

Closure of termsets with guu is handled mostly in contexts in which termsets appear, not in the termset class itself.

termset1 <- (((modifiersx)* ((!(firstarg) terms) / (KA (freemod)? termset2 (freemod)? (guu)? KI (freemod)? termset1))) / (modifiersx)+)

termset2 <- (termset1 ((guu &(A1)))? ((A1 (freemod)? termset1 ((guu &(A1)))?))*)

termset <- ((terms (freemod)? GO (freemod)? barepred) / termset2)

More complicated predicates

This is a basic sentence predicate with a termset attached.

barepred <- (sentpred (freemod)? (((termset (guu)?) / (guu &(termset))))?)

This is a tensed predicate. PA1 is the form allowing internal pauses and also the option of ga.

markpred <- (PA1 (freemod)? barepred)

This is a possibly repeatedly negated barepred or markpred. Systematic distinctions drawn between classes of marked and unmarked predicates in the LIP grammar were demonstrably already unnecessary in that grammar.

backpred1 <- (((NO2 (freemod)?))* (barepred / markpred))

The scheme for afterthought linkage of sentence predicates with logical connectives and attachment of shared final termsets to such linked predicates in the final version of the LIP grammar is very elegant and left grouping in a way which cannot be implemented with a PEG. Also, the ACI connectives are simply weird in the LIP grammar. Here we treat the ACI connectives as a fully privileged more tightly binding series of connectives (so that backpred is actually isomorphic to predicate2 in some sense) and allow attachment of final termsets in a way which is not as generous as the original arrangement but which is likely to cover all possibilities which will occur in speech.

A backpred is a chain of backpred1's linked by ACI connectives (if there is more than one) possibly with a shared final termset, in turn optionally linked to a further chain of backpreds possibly with a shared final termset.

backpred <- (((backpred1 ((ACI (freemod)? backpred1))+ (freemod)? (((termset (guu)?) / (guu &(termset))))?) ((((ACI (freemod)? backpred))+ (freemod)? (((termset (guu)?) / (guu &(termset))))?))?) / backpred1)

A predicate2 is built from backpreds using A series connectives in the same double way that backpreds are built from backpred1s. It should be noted that the ACI and A series connectives in all contexts are taken to group to the left, in semantic terms.

A predicate2 may not start with ge because of a possible ambiguity with the AGE connectives. For this reason, one generally cannot start a sentence predicate with ge (pragmatically, there is no reason to do so).

predicate2 <- (!(GE) (((backpred ((A1 !(GE) (freemod)? backpred))+ (freemod)? (((termset (guu)?) / (guu &(termset))))?) ((((A1 (freemod)? predicate2))+ (freemod)? (((termset (guu)?) / (guu &(termset))))?))?) / backpred))

A predicate1 is a predicate2 or built by linking predicate2's with the most loosely binding AGE series of logical connectives. This groups to the right, and no shared termset is provided.

predicate1 <- ((predicate2 AGE (freemod)? predicate1) / predicate2)

The top level sentence predicate class consists of predicate1's and possibly multiply negated BI series logical predicates.

identpred <- (((NO1 (freemod)?))* (BI (freemod)? argument1 (guu)?))

predicate <- (predicate1 / identpred)

Sentence forms

I overhauled the various sentence structures by actually enforcing statements in Notebook 3 that certain terms or termset structures in various grammar rules should consist entirely of modifiers or should contain no more than one argument. I discovered in parsing Leith that allowing second and further arguments to appear before the predicate is a fruitful source of unintended parses of actually ill-formed utterances: orphaned final arguments of one sentence become unintended first arguments of the next; but clearly we need to allow something like this to have SOV word order. I solved this by separating second and further untagged arguments before a predicate from the subject with the new particle gio. Further, I stipulated that imperatives cannot be tensed (tensed sentences without subjects are read as gasents with the subject omitted and are in effect observatives) and also stipulated that modifiers before an untensed predicate also give an imperative.

A subject is a term list containing at least one argument and no more than one untagged argument. NOTE: this should probably appear above in the term lists.

subject <- (((modifiers (freemod)?))? ((argxx subject) / (argument ((modifiersx (freemod)?))?)))

A gasent is a sentence with the subject delayed to the end: it consists of a tense followed by a barepred (which may include a final termset) followed by ga followed by terms. The final term list will either be a subject or contain all the arguments of the predicate (with the first one optionally separated from the others by gio).

gasent1 <- (((NO1 (freemod)?))* (PA1 (freemod)? barepred ((GA2 (freemod)? subject))?))

gasent2 <- (((NO1 (freemod)?))* (PA1 (freemod)? sentpred (modifiers)? (GA2 (freemod)? subject (freemod)? (GIO)? (freemod)? (terms)?)))

gasent <- (gasent2 / gasent1)

A statement is either a gasent or a gasent with fronted modifiers, or a subject, optionally followed by gio and further terms, followed by a predicate.

Fronted modifiers before a gasent are not allowed by the rule given here: we follow it by the rule from the "official" parser. We are inclined not to allow the form with fronted modifiers, because it is often what is obtained when a sentence is parsed in an unintended way by the "official" parser. The modifier can be fronted anyway as an instance of headterms, marked with gi.

statement <- (gasent / (subject (freemod)? ((GIO (freemod)? terms))? predicate))

statement <- (gasent / (modifiers (freemod)? gasent) / (subject (freemod)? ((GIO (freemod)? terms))? predicate))

A keksent is a sentence or uttAx (see below: this class is inlined rather than named here) forethought connected to an uttA1, a very general class of sentences and sentence fragments. Odd and interesting things can be said using the uttA1 option.

keksent <- (((NO1 (freemod)?))* ((KA (freemod)? sentence (freemod)? KI (freemod)? uttA1) / (KA sentence (freemod)? KI (freemod)? uttA1) / (KA (freemod)? headterms (freemod)? sentence (freemod)? KI (freemod)? uttA1)))

This is a sentence-initial negation operator with sentence long scope.

neghead <- ((NO1 (freemod)? gap) / (NO2 PAUSE))

A sen1 is a first approximation to the sentence class. The first and third options describe imperatives, in which no argument appears before the untensed predicate. The other two are familiar sentence classes. The sen1 can further be negated one or more times (that is what neghead is for).

The first rule does not allow modifiers initial to imperatives (gi can be used to attach them if desired). The second rule is the rule in the "official" parser which does allow fronted modifiers in this class.

sen1 <- (((neghead (freemod)?))* (statement / predicate / keksent))

sen1 <- (neghead freemod?)* ((modifiers (freemod)? !(gasent) predicate) / statement / predicate / keksent)

A sentence is one or more sen1's linked by ICA logical connectives (if there is more than one of them).

sentence <- (sen1 ((ICA (freemod)? sen1))*)

An uttAx is a sentence (possibly logically connected!) with a shared fronted sequence of arguments (actually to be taken as final arguments).

headterms <- ((terms GI))+

uttAx <- (headterms (freemod)? sentence (giuo)?)

Inverse vocatives and free modifiers

The inverse vocative construction. This is similar to the vocative, though with the further option of a full statement being tagged with hue. I allow an argument used as an inverse vocative to be closed with guu; this averted many painful phenomena in the Leith parse, where frequently an inverse vocative intended to capture an argument actually captured an entire sentence.

HUE0 <- ([Hh] UE)

invvoc <- ((([ ])* HUE0 (CANCELPAUSE / (([ ])* &(C1))) name) / (HUE !(badspaces) (freemod)? descpred (guea)? (((((comma)? CI (comma)?) / (comma &(nonamemarkers) !(AcronymicName))) name))?) / (HUE !(badspaces) (freemod)? statement (giuo)?) / (HUE !(badspaces) (freemod)? argument1 (guu)?) / (([ ])* HUE0 ((',' ([ ])+) / (([ ])* &(V1))) name) / (HUE stringnospacesclosedblock))

The lengthy list of free modifier constructions. Note that optional free modifiers appear in most medial positions in rules and some final ones (final ones on rules viewed as "atomic", usually). A very important free modifier is an explicit comma pause [not followed by a name word without false name markers], which can thus be inserted in many places (including places required to separate vowel initial words from what precedes them). What the various constructions are can be divined from the list above of their constructors.

freemod <- ((NOUI / (SOI (freemod)? descpred (guea)?) / DIE / (NO1 DIE) / (KIE (comma)? utterance0 (comma)? KIU) / invvoc / voc / CANCELPAUSE / (comma !((&(nonamemarkers) Name))) / JO / UI1 / (([ ])* '...' ((([ ])* &(letter)))?) / (([ ])* '--' ((([ ])* &(letter)))?)) (freemod)?)


This class contains two kinds of fragmentary utterances, usually answers to questions.

uttA <- ((A1 / mex) (freemod)?)

This class contains sen1 and uttAx sentences and a menagerie of sentence fragments which may appear as answers to questions. Note that this may end with terminal punctuation.

uttA1 <- ((sen1 / uttAx / links / linkargs / argmod / (modifiers (freemod)? keksent) / terms / uttA / NO1) (freemod)? (period)?)

An uttC is a possibly multiply negated uttA1. The negations are separated from the utterance negated by a gap or pause. The PAUSE here may be semantically significant (with latest mods, it won't change the meaning, but it will change the parse tree)!

uttC <- ((neghead (freemod)? uttC) / uttA1)

An uttD is a sentence possibly with terminal punctuation not followed by ICI or ICA, or a sequence of uttC's linked by ICI (if there is more than one uttC).

uttD <- ((sentence (period)? !(ICI) !(ICA)) / (uttC ((ICI (freemod)? uttD))*))

An uttE is a sequence of uttD's linked by ICA connectives. A sentence (sen1's linked by ICA connectives) will be parsed as a single uttD, however, due to the careful definition of uttD.

uttE <- (uttD ((ICA (freemod)? uttD))*)

An uttF is a sequence of uttE's linked by I utterance connectives. These are supposed to group to the left.

uttF <- (uttE ((I (freemod)? uttE))*)

A top level utterance can be constructed in various weird and wonderful ways. The first class is designed to be quoted or parenthesized; the second one is constrained to end with end of text. A free modifier followed by an utterance is an utterance (the only leading appearance of freemods); a free modifier is an utterance by itself; uttF's can be linked with IGE to utterances; an uttF is an utterance; ICA followed by an uttF is an utterance; I followed by an utterance is an utterance. NOTE: some of these cases need analysis, and there might possibly be order bugs in this rule.

utterance0 <- (!(GE) ((!(PAUSE) freemod (period)? utterance0) / (!(PAUSE) freemod (period)?) / (uttF IGE utterance0) / uttF / (I (freemod)? (uttF)?) / (I (freemod)? (period)?) / (ICA (freemod)? uttF)) ((&(I) utterance0))?)

utterance <- (!(GE) ((!(PAUSE) freemod (period)? utterance) / (!(PAUSE) freemod (period)? ((&(I) utterance))? end) / (uttF IGE utterance) / (I (freemod)? (period)? ((&(I) utterance))? end) / (uttF ((&(I) utterance))? end) / (I (freemod)? uttF ((&(I) utterance))? end) / (ICA (freemod)? uttF ((&(I) utterance))? end)))