!!!Inari Sámi morphological analyser


!!!Multichar_Symbols definitions

!!Parts of speech
 *  +N +A +Adv +V					  
 *  +Pron +CS +CC					  
 *  +Adp +Po +Pr 					  
 *  +Interj +Pcle					  
 *  +Num +ABBR +ACR +Coll +Arab  +Rom        

!!Subclasses
 *  +Pers +Dem +Interr +Indef 		  
 *  +Refl +Recipr +Rel	+Ord +NomAg			  

!!Grammatical properties


!Person - number
 *  +Sg +Pl +Du						  

 *  +Sg1 +Sg2 +Sg3 				  
 *  +Du1 +Du2 +Du3 				  
 *  +Pl1 +Pl2 +Pl3					  

 *  +PxSg1 +PxSg2 +PxSg3 			  
 *  +PxDu1 +PxDu2 +PxDu3 			  
 *  +PxPl1 +PxPl2 +PxPl3 			  

!Case
 *  +Nom +Gen +Acc 				  
 *  +Ill +Ine +Ela 				  
 *  +Com +Ess +Par +Abe 			  
 *  +Loc                			  


 *  +Known  mon, until we find a better tag

!Adjectival forms
 *  +Comp +Superl 			  
 *  +Attr				  

!Adverb types 

 *  +Spat	   	Spatial adverbs
 *  +Temp	    Temporal adverbs


!Tense - mood
 *  +Ind +Pot +Cond +Imprt +ImprtII  
 *  +Prs +Prt						  

!Indefinite verb forms
 *  +Pass +Sup						  
 *  +Inf +Ger +GerII 				  
 *  +ConNeg +Neg 					  
 *  +PrsPrc +PrfPrc 				  
 *  +VGen +VAbess					  
 *  +Actio					  



All non-positional derivations should be preceded by this tag, to make it possible
to target regular expressions at all derivations in a language-independent way:
just specify +Der|+Der1 .. +Der5 and you are set.

 * +Der
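
As a rough illustration of the regular expression mentioned above, the following Python sketch checks whether an analysis string contains {{+Der}} or one of {{+Der1}} .. {{+Der5}} as a complete tag. The function name and the analysis strings are made up for the example ({{lemma}} is a placeholder), and the sketch assumes that tags are simply concatenated with {{+}}, as in the tag lists in this document.

```python
import re

# Matches "+Der" or "+Der1" .. "+Der5" only when it is a complete tag,
# i.e. when it is followed by the next "+" tag or by the end of the string.
# "+Der/ag" on its own is therefore not matched by this pattern.
DER_TAG = re.compile(r"\+Der[1-5]?(?=\+|$)")


def has_derivation(analysis: str) -> bool:
    """Return True if the (hypothetical) analysis string carries a +Der tag."""
    return DER_TAG.search(analysis) is not None


if __name__ == "__main__":
    # Placeholder analysis strings, not real analyser output.
    for example in ("lemma+N+Der+Der/ag+A", "lemma+V+Der2+N", "lemma+N+Sg+Nom"):
        print(example, has_derivation(example))
```

The lookahead keeps the pattern from matching tags that merely start with {{+Der}}, such as {{+Der/ag}} or a hypothetical {{+Der6}}.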


!Derivations
!Other/unclassified derivations, can appear in all positions:


 *  +Der/ag  neeljičievâg neeljijienâg kuulmâloonjâg
 *  +Der/ahasas                     85-ahasâš škovlâahasâš
 *  +Der/ivvaas                  
 *  +Der/vualasas                     tutkâmvuálásâš

!Clitics
 *  +Foc       				  
 *  +Foc/gan        
 *  +Foc/gas        
 *  +Foc/ges        
 *  +Foc/gis        
 *  +Foc/gin        
 *  +Foc/han        
 *  +Foc/kin        
 *  +Foc/ba         
 *  +Foc/pa         
 *  +Foc/sun        
 *  +Foc/kis        
 *  +Foc/ban        
 *  +Foc/baa        
 *  +Foc/baan       
 *  +Foc/ge         
 *  +Foc/go         
 *  +Foc/kas        
 *  +Foc/nii        
 *  +Foc/uv        

!Usage tags

 * __+Err/Orth__ - substandard, not in normative fst
 * __+Err/Lex__ - substandard, not in normative fst, no normative lemma
 * __+MWE__ - MultiWord Expression, used for abbreviation extraction for preprocess.sh
 * __+Use/-PLX__ - do not include in Polderland spellers (most likely irrelevant for smn)
 * __+Use/-Spell__ - do not include in speller (even though the entry is formally correct)
 * __+Use/SpellNoSugg__ - recognized, but not suggested in speller

!!Semantic properties of names
 *  +Prop +Sem/Ani +Sem/Atr	               
 *  +Sem/Mal +Sem/Fem +Sem/Sur                
 *  +Sem/Plc +Sem/Org +Sem/Obj +Sem/Obj-el                
 *  +Sem/Measr +Sem/Money +Sem/Veh +Sem/Year  
!!Punctuation

 *  +CLB +PUNCT +HYPH         
 *  +PAR +LEFT +RIGHT         

!!Morphophonemes

 * ^P ^K ^Č ^H ^T         for pp:v etc. gradation

 *  k4 l4 t4 p4 c4 č4    = these consonants change in consonant gradation
 *  '7  
 *  i4   i6              = this is the postvocalic i consonant, realised as i
 *  i6  j6             = a fake vowel and a fake consonant, used to make the rules work for exceptions
 *  i5                 = beginning of the comitative suffix in loanwords
 *  a5 ä5 á5 u5 o5    these vowels do not change
 *  h5 j5 m5 ŋ5 t5 c5 d5 l5 r5 č5 k5    these consonants do not change in the weak grade (WG)
 *  y5                    this vowel does not change, e.g. pyerá
 *  i2  u2 i3 â2       stem vowels that change to e, e.g. kyeli:kyeˊle
 *  ⎈    used for dynamic compounds, U+2388

!Archiphonemes

 * ^RC     Root consonant dummy
 * ^RV     Root vowel dummy
 * ^SC     Suffix consonant dummy
 * ^SV     Suffix vowel dummy
 * ^V     = vowel copy

!Triggers

 * ^CLEN   Consonant lengthening in qual WG
 * ^CSH    Consonant shortening (not WG)
 * ^FCD    Final consonant deletion
 * ^EA     is á and root vowel change in Ill Sg of i-stems
 * ^RLEN   Root vowel lengthening (impl. WG)
 * ^RVSH   Root vowel shortening
 * ^SLEN   Suffix vowel lengthening 
 * ^SVLOW   Suffix vowel lowering â > á and u > o
 * ^SVSH   Second syllable vowel shortening
 * ^VLOW   Vowel lowering in 3rd sg of contract verbs, e.g. tuhhid:tohhe
 * ^WG     Weak grade trigger
 * ^ÁE      á->e
 * ^ÁI      á->i
 * ^VHIGH  = raising of vowels in verbs: o to uu, a to oo
 * ^VBACK  = back vowels in verbs: ä to a (when needed; normally 2syll a|â is enough)
 * ^BLOCK  = this symbol is used only to block contexts that would otherwise trigger a rule

!!Symbols that need to be escaped on the lower side (towards twolc):


!!Variants


!!Semantic tags

 * +Sem/Body  denotes bodyparts
 * +Sem/Plc  denotes places

!!Compound tags
 * +Cmp  compounds
 * +Cmp/Hyph  compounds

 * +Cmp/SgNom  compounds
 * +Cmp/PlNom  compounds
 * +Cmp/Attr  compounds
 * +Cmp/SgGen  compounds
 * +Cmp/PlGen  compounds
 * +Cmp/SplitR  compounds
 * +Cmp/Sh  compounds

 * __+CmpNP/All__ - ... in all positions, __default__, this tag does not have to be written
 * __+CmpNP/First__ - ... only be first part in a compound or alone
 * __+CmpNP/Pref__ - ... only __first__ part in a compound, NEVER alone
 * __+CmpNP/Last__ - ... only be last part in a compound or alone
 * __+CmpNP/Suff__ - ... only __last__ part in a compound, NEVER alone
 * __+CmpNP/None__ - ... does not take part in compounds
 * __+CmpNP/Only__ - ... only be part of a compound, i.e. can never
                    be used alone, but can appear in any position
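
The position restrictions listed above can be summarised as a small lookup table. The Python sketch below is only a reading aid, not part of the analyser: the position names {{alone}}, {{first}}, {{middle}} and {{last}} and the function {{may_occur}} are assumptions introduced for the example, and "any position" is taken to include the middle of a longer compound.

```python
from typing import Optional

# Positions each +CmpNP tag permits, following the descriptions in the list above.
# "middle" (neither first nor last part) is an assumption of this sketch.
ALLOWED_POSITIONS = {
    "+CmpNP/All": {"alone", "first", "middle", "last"},
    "+CmpNP/First": {"alone", "first"},
    "+CmpNP/Pref": {"first"},
    "+CmpNP/Last": {"alone", "last"},
    "+CmpNP/Suff": {"last"},
    "+CmpNP/None": {"alone"},
    "+CmpNP/Only": {"first", "middle", "last"},
}


def may_occur(tag: Optional[str], position: str) -> bool:
    """Return True if a word with this +CmpNP tag may appear in the given position.

    A missing tag (None) is treated as +CmpNP/All, the documented default.
    """
    return position in ALLOWED_POSITIONS[tag or "+CmpNP/All"]


if __name__ == "__main__":
    print(may_occur("+CmpNP/Pref", "alone"))
    print(may_occur(None, "last"))
```
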
The tagged part of the compound should make a compound using:

 * __+CmpN/SgN__ Singular Nominative
 * __+CmpN/SgG__ Singular Genitive
 * __+CmpN/PlG__ Plural Genitive

Unmarked = Default, i.e. {{+CmpN/SgN}} for SMN.

The second part of the compound may require that the previous (left part) is:

 * __+CmpN/SgNomLeft__ Singular Nominative
 * __+CmpN/SgGenLeft__ Singular Genitive
 * __+CmpN/PlGenLeft__ Plural Genitive


!!Language tagged names

 * +OLang/ENG  		  
 * +OLang/FIN  		  
 * +OLang/NNO  		  
 * +OLang/NOB  		  
 * +OLang/SME  		  
 * +OLang/SMA  		  
 * +OLang/SWE  		  
 * +OLang/UND  		  
 * +OLang/RUS  		  


!!Flag diacritics
We have manually optimised the structure of our lexicon using the following
flag diacritics to restrict morphological combinatorics. Compounds with verbs
are allowed only if the verb is further derived into a noun again:
 | @P.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised
 | @D.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised
 | @C.NeedNoun@ | (Dis)allow compounds with verbs unless nominalised
 | @R.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised

For languages that allow compounding, the following flag diacritics are needed
to control position-based compounding restrictions for nominals. Their use is
handled automatically if combined with +CmpN/xxx tags. If not used, they will
do no harm.
 | @P.CmpFrst.FALSE@ | Require that words tagged as such only appear first
 | @D.CmpPref.TRUE@ | Block such words from entering ENDLEX
 | @P.CmpPref.FALSE@ | Block these words from making further compounds
 | @D.CmpLast.TRUE@ | Block such words from entering R
 | @D.CmpNone.TRUE@ | Combines with the next tag to prohibit compounding
 | @U.CmpNone.FALSE@ | Combines with the prev tag to prohibit compounding
 | @U.CmpNone.TRUE@ | Combines with the two previous ones to block compounding
 | @P.CmpOnly.TRUE@ | Sets a flag to indicate that the word has passed R
 | @D.CmpOnly.FALSE@ | Disallow words coming directly from root.
 | @D.CmpHyph.TRUE@ | Flag to control hyphenated compounds like proper nouns
 | @U.CmpHyph.FALSE@ | Flag to control hyphenated compounds like proper nouns
 | @U.CmpHyph.TRUE@ | Flag to control hyphenated compounds like proper nouns
 | @C.CmpHyph@ | Flag to control hyphenated compounds like proper nouns
 | @P.CmpHyph.TRUE@ | Flag to control hyphenated compounds like proper nouns
 | @N.CmpHyph.TRUE@ | Flag to control hyphenated compounds like proper nouns


Use the following flag diacritics to control downcasing of derived proper
nouns (e.g. Finnish Pariisi -> pariisilainen). See e.g. North Sámi for how to use
these flags. There exists a ready-made regex that will do the actual down-casing
given the proper use of these flags.
 | @U.Cap.Obl@ | Allowing downcasing of derived names: deatnulasj.
 | @U.Cap.Opt@ | Allowing downcasing of derived names: deatnulasj.


 * @U.NeedsVowRed.OFF@ is used to force hyphenation/non-reduction: samediggi-
 * @U.NeedsVowRed.ON@ is used to force reduction w/o hyphen: samedigge#xxx
 * @C.NeedsVowRed@ Clearing this feature, so that it doesn't interfere with further compounding

 * @P.Px.add@	
 * @R.Px.add@	
 * @P.Px.block@
 * @D.Px.block@

 * @R.SpellRlx.ON@ Flag used to tag spell-relax-analysed strings (and only those).
 * @D.SpellRlx.ON@ Flag used to tag spell-relax-analysed strings (and only those).
 * @C.SpellRlx@ Flag used to tag spell-relax-analysed strings (and only those).

 * @R.SpaceCmp.ON@ Flag to tag compounds written with a space
 * @D.SpaceCmp.ON@ Flag to tag compounds written with a space
 * @C.SpaceCmp@ Flag to tag compounds written with a space


!!!Basic lexica, pointing to the other lexicon files

 LEXICON Root   

 * __LEXICON ProperNoun   __ 


!!!Lexicon ENDLEX
And this is the ENDLEX of everything:
{{{
 @D.CmpOnly.FALSE@@D.CmpPref.TRUE@@D.NeedNoun.ON@ # ;
}}}
The {{@D.CmpOnly.FALSE@}} flag diacritic is used to prevent words tagged
with +CmpNP/Only from ending here.
The {{@D.NeedNoun.ON@}} flag diacritic is used to block illegal compounds.
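
To make the effect of the ENDLEX flags easier to follow, here is a minimal Python sketch of how a sequence of flag diacritics is evaluated along one path. It is a simplified model and not the HFST or Xerox implementation: it covers only the P (set), C (clear), R (require), D (disallow) and U (unify) operations and ignores N flags. The function name {{path_is_valid}} is made up for the example, and {{@P.CmpOnly.FALSE@}} is assumed here to be the flag that marks a +CmpNP/Only word coming from the root; that flag is not listed in the tables above.

```python
import re

# One flag diacritic: @OP.FEATURE@ or @OP.FEATURE.VALUE@ (N flags are not modelled).
FLAG = re.compile(r"@([PDRCU])\.([^.@]+)(?:\.([^.@]+))?@")

# The flag sequence of the ENDLEX entry shown above.
ENDLEX = "@D.CmpOnly.FALSE@@D.CmpPref.TRUE@@D.NeedNoun.ON@"


def path_is_valid(flags: str) -> bool:
    """Return True if no flag check fails when the flags are read left to right."""
    state = {}  # feature name -> current value
    for op, feature, value in FLAG.findall(flags):
        value = value or None
        current = state.get(feature)
        if op == "P":  # set the feature to the value
            state[feature] = value
        elif op == "C":  # clear the feature
            state.pop(feature, None)
        elif op == "R":  # require: feature must be set (to this value, if given)
            if current is None or (value is not None and current != value):
                return False
        elif op == "D":  # disallow: fail if feature is set (to this value, if given)
            if current is not None and (value is None or current == value):
                return False
        elif op == "U":  # unify: set if unset, otherwise the values must agree
            if current is None:
                state[feature] = value
            elif current != value:
                return False
    return True


if __name__ == "__main__":
    print(path_is_valid(ENDLEX))                                  # no flags set
    print(path_is_valid("@P.NeedNoun.ON@" + ENDLEX))              # verb not nominalised
    print(path_is_valid("@P.NeedNoun.ON@@C.NeedNoun@" + ENDLEX))  # flag cleared again
```

In this model a path that still carries {{NeedNoun=ON}} when it reaches ENDLEX is rejected, while a path on which the flag was cleared with {{@C.NeedNoun@}} passes.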