! Divvun & Giellatekno - open source grammars for Sámi and other languages
! Copyright © 2000-2010 The University of Tromsø & the Norwegian Sámi Parliament
! http://giellatekno.uit.no & http://divvun.no
!
! This program is free software; you can redistribute and/or modify
! this file under the terms of the GNU General Public License as published by
! the Free Software Foundation, either version 3 of the License, or
! (at your option) any later version. The GNU General Public License
! is found at http://www.gnu.org/licenses/gpl.html. It is
! also available in the file $GTHOME/LICENSE.txt.
!
! Other licensing options are available upon request, please contact
! giellatekno@hum.uit.no or feedback@divvun.no

! ========================================================================== !
!!               !!!Inari Sámi morphological analyser
! ========================================================================== !


Multichar_Symbols  !!≈ !!!@CODE@ definitions

!! !!Parts of speech
 +N +A +Adv +V					  !!= * @CODE@
 +Pron +CS +CC					  !!= * @CODE@
 +Adp +Po +Pr 					  !!= * @CODE@
 +Interj +Pcle					  !!= * @CODE@
 +Num +ABBR +ACR +Coll +Arab  +Rom        !!= * @CODE@
 
!! !!Subclasses
 +Pers +Dem +Interr +Indef 		  !!= * @CODE@
 +Refl +Recipr +Rel	+Ord +NomAg			  !!= * @CODE@

!! !!Grammatical properties

 +IV +TV

!! !Person - number
 +Sg +Pl +Du						  !!= * @CODE@

 +Sg1 +Sg2 +Sg3 				  !!= * @CODE@
 +Du1 +Du2 +Du3 				  !!= * @CODE@
 +Pl1 +Pl2 +Pl3					  !!= * @CODE@

 +PxSg1 +PxSg2 +PxSg3 			  !!= * @CODE@
 +PxDu1 +PxDu2 +PxDu3 			  !!= * @CODE@
 +PxPl1 +PxPl2 +PxPl3 			  !!= * @CODE@

!! !Case
 +Nom +Gen +Acc 				  !!= * @CODE@
 +Ill +Ine +Ela 				  !!= * @CODE@
 +Com +Ess +Par +Abe 			  !!= * @CODE@
 +Loc                			  !!= * @CODE@


 +Known !!= * @CODE@ mon, until we find a better tag

!! !Adjectival forms
 +Comp +Superl 			  !!= * @CODE@
 +Attr				  !!= * @CODE@

!! !Adverb types 
 
 +Spat	   !!= * @CODE@	Spatial adverbs
 +Temp	   !!= * @CODE@ Temporal adverbs


!! !Tense - mood
 +Ind +Pot +Cond +Imprt +ImprtII  !!= * @CODE@
 +Prs +Prt						  !!= * @CODE@
 +Opt

!! !Indefinite verb forms
 +Pass +Sup						  !!= * @CODE@
 +Inf +Ger +GerII 				  !!= * @CODE@
 +ConNeg +Neg 					  !!= * @CODE@
 +PrsPrc +PrfPrc 				  !!= * @CODE@
 +VGen +VAbess					  !!= * @CODE@
 +Actio					  !!= * @CODE@

! Der#begin
!! {{{
! Derivation position in a derivation row:  Affix and
! 1              2            3            4             POS type
+Der1          +Der2        +Der3		+Der4

! Der#1
+Der/t                                                   ! NA (XXX check and remove)
+Der/Dimin                                               ! NN (was: Der/aš & Der/š)
+Der/lasj                                                ! NA
+Der/d                                                   ! VV
+Der/tt                                                   ! VV - Causative čälittiđ
+Der/Caus                                                ! VV - 3-syll causatives
+Der/l                                                   ! VV
+Der/st                                                  ! VV čälistiđ
+Der/Car                                                 ! NA * +Der1+Der2 - can only combine with Der3 caritive: peljittem
+Der/laakan                                               ! AA * +Der1+Der2 - can only combine with Der3
+Der/Pass                                               ! VV - short passive
! Der#2
              +Der/NomAg
              +Der/NomAct                                ! VN Der/NomAct has two realisations, with different restrictions,
              +Der/sasj                                  ! NA
              +Der/alla                                  ! VV
              +Der/AAdv                                    ! adverb pyeremusávt pyeremusâht
              +Der/taa                                    ! adverb pyeremustáá !This is not the best tag?
! Der#3                    
                           +Der/Pass                    ! VV - long passive
                           +Der/vuota                   ! AN
! Der#4                                                            
                                        +Der/InchL      ! VV
!                                       +Der/NomAct      ! VN Der/NomAct has two realisations, with different restrictions;
                                                         !    this one is the former Der/n, and it is the Der4 realisation.
                                                         !    Commented out so that the tag is not defined twice, but kept
                                                         !    here for documentation purposes.
                                        +Der/upmi        ! VN
                                        +Der/mas         ! VN
!! }}}
! Der#end


!! All non-positional derivations should be preceded by this tag, to make it possible
!! to target regular expressions at all derivations in a language-independent way:
!! just specify +Der|+Der1 .. +Der5 and you are set.

+Der  !!≈ * @CODE@
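
! A minimal sketch (an assumption, not taken from this file) of what such a
! regular expression could look like in xfst/hfst-xfst syntax. The name AnyDer
! is hypothetical; only the positional tags declared above (+Der1 .. +Der4)
! are listed, and a language that declares +Der5 would add it to the union:
!
!   define AnyDer [ "+Der" | "+Der1" | "+Der2" | "+Der3" | "+Der4" ] ;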

 
!! !Derivations
!! !Other/unclassified derivations, can appear in all positions:


 +Der/ag !!= * @CODE@ neeljičievâg neeljijienâg kuulmâloonjâg
 +Der/ahasas                    !!= * @CODE@ 85-ahasâš škovlâahasâš
 +Der/ivvaas                  !!= * @CODE@
 +Der/vualasas                    !!= * @CODE@ tutkâmvuálásâš

!! !Clitics
 +Qst 
 +Foc       				  !!= * @CODE@
 +Foc/gan       !!= * @CODE@ 
 +Foc/gas       !!= * @CODE@ 
 +Foc/ges       !!= * @CODE@ 
 +Foc/gis       !!= * @CODE@ 
 +Foc/gin       !!= * @CODE@ 
 +Foc/han       !!= * @CODE@ 
 +Foc/kin       !!= * @CODE@ 
 +Foc/ba        !!= * @CODE@ 
 +Foc/pa        !!= * @CODE@ 
 +Foc/sun       !!= * @CODE@ 
 +Foc/kis       !!= * @CODE@ 
 +Foc/ban       !!= * @CODE@ 
 +Foc/baa       !!= * @CODE@ 
 +Foc/baan      !!= * @CODE@ 
 +Foc/ge        !!= * @CODE@ 
 +Foc/go        !!= * @CODE@ 
 +Foc/kas       !!= * @CODE@ 
 +Foc/nii       !!= * @CODE@ 
 +Foc/uv       !!= * @CODE@ 

!! !Usage tags

 +Err/Orth        !!= * __@CODE@__ substandard, not in normative fst
 +Err/Lex         !!= * __@CODE@__ substandard, not in normative fst, no normative lemma
 +MWE             !!= * __@CODE@__ - MultiWord Expression, used for abbreviation extraction for preprocess.sh
 +Use/-PLX        !!= * __@CODE@__ - do not include in Polderland spellers (most likely irrelevant for smn)
 +Use/-Spell      !!= * __@CODE@__ - do not include in speller (even though the entry is formally correct)
 +Use/SpellNoSugg !!= * __@CODE@__ - Recognized, but not suggested in speller 

!! !!Semantic properties of names
 +Prop +Sem/Ani +Sem/Atr	               !!= * @CODE@
 +Sem/Mal +Sem/Fem +Sem/Sur                !!= * @CODE@
 +Sem/Plc +Sem/Org +Sem/Obj +Sem/Obj-el     !!= * @CODE@           
 +Sem/Measr +Sem/Money +Sem/Veh +Sem/Year  !!= * @CODE@
!! !!Punctuation

 +CLB +PUNCT +HYPH         !!= * @CODE@
 +PAR +LEFT +RIGHT         !!= * @CODE@

!! !!Morphophonemes

^P ^K ^Č ^H ^T        !!= * @CODE@ for pp:v etc. gradation

! m7 n7 ŋ7 v7 s7 š7 r7 đ7 j7 l7 h7 '7   these are the dotted ones
 k4 l4 t4 p4 c4 č4  !!= * @CODE@  = these are consonants that change in consonant gradation
 '7 !!= * @CODE@ 
 i4   i6             !!= * @CODE@ = this is the postvocalic i consonant, realised as i
 i6  j6            !!= * @CODE@ = these are a fake vowel and consonant, to get rules to function for exceptions
 i5				!!= * @CODE@ = comitative suffix-begin in loanwords
 a5 ä5 á5 u5 o5  !!= * @CODE@  these vowels do not change
 h5 j5 m5 ŋ5 t5 c5 d5 l5 r5 č5 k5  !!= * @CODE@  these consonants do not change in WG
 y5                  !!= * @CODE@  this vowel does not change, e.g. pyerá
 i2  u2 i3 â2     !!= * @CODE@  stemvowel changing to e, e.g. kyeli:kyeˊle 
 ⎈   !!= * @CODE@ used for dynamic compounds, U+2388
 
!! !Archiphonemes

^RC   !!= * @CODE@  Root consonant dummy
^RV   !!= * @CODE@  Root vowel dummy
^SC   !!= * @CODE@  Suffix consonant dummy
^SV   !!= * @CODE@  Suffix vowel dummy
^V    !!= * @CODE@ = vowel copy

!! !Triggers

^CLEN !!= * @CODE@  Consonant lengthening in qual WG
^CSH  !!= * @CODE@  Consonant shortening (not WG)
^FCD  !!= * @CODE@  Final consonant deletion
^EA    !!= * @CODE@ is á and root vowel change in Ill Sg of i-stems
^RLEN !!= * @CODE@  Root vowel lengthening (impl. WG)
^RVSH !!= * @CODE@  Root vowel shortening
^SLEN !!= * @CODE@  Suffix vowel lengthening 
^SVLOW !!= * @CODE@  Suffix vowel lowering â > á and u > o
^SVSH !!= * @CODE@  Second syllable vowel shortening
^VLOW   !!= * @CODE@ is Vowel lowering in 3rd sg of contract verbs tuhhid:tohhe
^WG   !!= * @CODE@  Weak grade trigger
^ÁE     !!= * @CODE@ á->e
^ÁI     !!= * @CODE@ á->i
^VHIGH !!= * @CODE@ = raising of vowels in verbs: o to uu, a to oo
^VBACK   !!= * @CODE@ = back vowels for verbs, ä to a (when needed; normally 2syll a|â is enough)
^BLOCK   !!= * @CODE@ = This symbol is used only to block contexts that would otherwise trigger a rule

!! !!Symbols that need to be escaped on the lower side (towards twolc):
 »7     ! »
 «7     ! «
 %[%>%] ! >
 %[%<%] ! <

 +Use/NG    ! not-generate, for ped generation isme-ped.fst
 +Use/MT    ! generate only for MT 
 +Use/Circ
 
!! !!Variants
+v1
+v2
+v3
+v4
+Hom1
+Hom2


!! !!Semantic tags

+Sem/Body !!= * @CODE@ denotes bodyparts
+Sem/Plc !!= * @CODE@ denotes places

!! !!Compound tags
+Cmp !!= * @CODE@ compounds
+Cmp/Hyph !!= * @CODE@ compounds

+Cmp/SgNom !!= * @CODE@ compounds
+Cmp/PlNom !!= * @CODE@ compounds
+Cmp/Attr !!= * @CODE@ compounds
+Cmp/SgGen !!= * @CODE@ compounds
+Cmp/PlGen !!= * @CODE@ compounds
+Cmp/SplitR !!= * @CODE@ compounds
+Cmp/Sh !!= * @CODE@ compounds

+CmpNP/All       !!≈ * __@CODE@__ - ... in all positions, __default__, this tag does not have to be written
+CmpNP/First     !!≈ * __@CODE@__ - ... only be first part in a compound or alone
+CmpNP/Pref      !!≈ * __@CODE@__ - ... only __first__ part in a compound, NEVER alone
+CmpNP/Last      !!≈ * __@CODE@__ - ... only be last part in a compound or alone
+CmpNP/Suff      !!≈ * __@CODE@__ - ... only __last__ part in a compound, NEVER alone
+CmpNP/None      !!≈ * __@CODE@__ - ... does not take part in compounds
+CmpNP/Only      !!≈ * __@CODE@__ - ... only be part of a compound, i.e. can never
                 !!                     be used alone, but can appear in any position
!! The tagged part of the compound should make a compound using:

+CmpN/SgN      !!≈ * __@CODE@__ Singular Nominative
+CmpN/SgG      !!≈ * __@CODE@__ Singular Genitive
+CmpN/PlG      !!≈ * __@CODE@__ Plural Genitive

!! Unmarked = Default, ie {{+CmpN/SgN}} for SMN.

!! The second part of the compound may require that the previous (left part) is:

+CmpN/SgNomLeft  !!≈ * __@CODE@__ Singular Nominative
+CmpN/SgGenLeft  !!≈ * __@CODE@__ Singular Genitive
+CmpN/PlGenLeft  !!≈ * __@CODE@__ Plural Genitive
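
! A hedged illustration of where such tags are assumed to go: on the upper
! (analysis) side of a lexc entry, directly after the lemma. The lemma, stem
! and continuation lexicon below are placeholders, not real entries, and the
! lines are kept commented out so that nothing is added to the analyser:
!
!   lemma+CmpN/SgG:stem HYPOTHETICAL_NOUN_CLASS ;         ! as a left part, use the Sg Gen form
!   lemma+CmpN/SgGenLeft:stem HYPOTHETICAL_NOUN_CLASS ;   ! as a right part, require a Sg Gen left part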


!! !!Language tagged names

+OLang/ENG  		  !!= * @CODE@
+OLang/FIN  		  !!= * @CODE@
+OLang/NNO  		  !!= * @CODE@
+OLang/NOB  		  !!= * @CODE@
+OLang/SME  		  !!= * @CODE@
+OLang/SMA  		  !!= * @CODE@
+OLang/SWE  		  !!= * @CODE@
+OLang/UND  		  !!= * @CODE@
+OLang/RUS  		  !!= * @CODE@


!! !!Flag diacritics
!! We have manually optimised the structure of our lexicon using the following
!! flag diacritics to restrict morphological combinatorics: compounds with verbs
!! are only allowed if the verb is further derived into a noun again:
 @P.NeedNoun.ON@    !!≈ | @CODE@ | (Dis)allow compounds with verbs unless nominalised
 @D.NeedNoun.ON@    !!≈ | @CODE@ | (Dis)allow compounds with verbs unless nominalised
 @C.NeedNoun@       !!≈ | @CODE@ | (Dis)allow compounds with verbs unless nominalised
 @R.NeedNoun.ON@       !!≈ | @CODE@ | (Dis)allow compounds with verbs unless nominalised
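!
! A hedged sketch of how the four NeedNoun flags are assumed to interact. The
! lexicon names starting with Hypothetical are illustrations only, the ENDLEX
! entry is simplified, and everything is kept commented out so that the
! analyser is not changed:
!
!   LEXICON HypotheticalVerbInCompound
!    @P.NeedNoun.ON@ HypotheticalVerbStems ;      ! P: set NeedNoun to ON
!
!   LEXICON HypotheticalNominaliser
!    @C.NeedNoun@ HypotheticalNounInflection ;    ! C: clear NeedNoun again
!
!   LEXICON ENDLEX
!    @D.NeedNoun.ON@ # ;                          ! D: block the path if NeedNoun is still ON
!
! @R.NeedNoun.ON@ is the mirror image of D: a path containing it succeeds only
! if NeedNoun is currently ON.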
!! 
!! For languages that allow compounding, the following flag diacritics are needed
!! to control position-based compounding restrictions for nominals. Their use is
!! handled automatically if combined with +CmpN/xxx tags. If not used, they will
!! do no harm.
 @P.CmpFrst.FALSE@ !!≈ | @CODE@ | Require that words tagged as such only appear first
 @D.CmpPref.TRUE@  !!≈ | @CODE@ | Block such words from entering ENDLEX
 @P.CmpPref.FALSE@ !!≈ | @CODE@ | Block these words from making further compounds
 @D.CmpLast.TRUE@  !!≈ | @CODE@ | Block such words from entering R
 @D.CmpNone.TRUE@  !!≈ | @CODE@ | Combines with the next tag to prohibit compounding
 @U.CmpNone.FALSE@ !!≈ | @CODE@ | Combines with the prev tag to prohibit compounding
 @U.CmpNone.TRUE@  !!≈ | @CODE@ | Combines with the two previous ones to block compounding
 @P.CmpOnly.TRUE@  !!≈ | @CODE@ | Sets a flag to indicate that the word has passed R
 @D.CmpOnly.FALSE@ !!≈ | @CODE@ | Disallow words coming directly from root.
 @D.CmpHyph.TRUE@  !!≈ | @CODE@ | Flag to control hyphenated compounds like proper nouns
 @U.CmpHyph.FALSE@ !!≈ | @CODE@ | Flag to control hyphenated compounds like proper nouns
 @U.CmpHyph.TRUE@  !!≈ | @CODE@ | Flag to control hyphenated compounds like proper nouns
 @C.CmpHyph@       !!≈ | @CODE@ | Flag to control hyphenated compounds like proper nouns
 @P.CmpHyph.TRUE@  !!≈ | @CODE@ | Flag to control hyphenated compounds like proper nouns
 @N.CmpHyph.TRUE@  !!≈ | @CODE@ | Flag to control hyphenated compounds like proper nouns

!! 
!! Use the following flag diacritics to control downcasing of derived proper
!! nouns (e.g. Finnish Pariisi -> pariisilainen). See e.g. North Sámi for how to use
!! these flags. There exists a ready-made regex that will do the actual down-casing
!! given the proper use of these flags.
 @U.Cap.Obl@        !!≈ | @CODE@ | Allowing downcasing of derived names: deatnulasj.
 @U.Cap.Opt@        !!≈ | @CODE@ | Allowing downcasing of derived names: deatnulasj.

! @P.Need3Part.ON@ @D.Need3Part.ON@ @C.Need3Part@ !3Part

@U.NeedsVowRed.OFF@ !!≈ * @CODE@ is used to force hyphenation/non-reduction: samediggi-
@U.NeedsVowRed.ON@  !!≈ * @CODE@ is used to force reduction w/o hyphen: samedigge#xxx
@C.NeedsVowRed@     !!≈ * @CODE@ Clearing this feature, so that it doesn't interfere with further compounding

@P.Px.add@	    !!≈ * @CODE@
@R.Px.add@	    !!≈ * @CODE@
@P.Px.block@    !!≈ * @CODE@
@D.Px.block@    !!≈ * @CODE@
@P.Nom3Px.add@
@R.Nom3Px.add@

@R.SpellRlx.ON@ !!≈ * @CODE@ Flag used to tag spell-relax-analysed strings (and only those).
@D.SpellRlx.ON@ !!≈ * @CODE@ Flag used to tag spell-relax-analysed strings (and only those).
@C.SpellRlx@    !!≈ * @CODE@ Flag used to tag spell-relax-analysed strings (and only those).

@R.SpaceCmp.ON@ !!≈ * @CODE@ Flag to tag compounds written with a space
@D.SpaceCmp.ON@ !!≈ * @CODE@ Flag to tag compounds written with a space
@C.SpaceCmp@    !!≈ * @CODE@ Flag to tag compounds written with a space


! =================================================
!! !!!Basic lexica, pointing to the other lexicon files
! =================================================

LEXICON Root   !!= @CODE@
 @U.Cap.Obl@ ProperNoun          ;
! !@U.Cap.Opt@ ProperNoun          ;
 ProperNoun-smi-nocomp ;
 NounRoot ;
 AdjectiveRoot ;
 VerbRoot ;
 VGen_verbs ;
 Adverb ;
 Particle ;
 Subjunction ;
 Conjunction ;
 Adposition ;
 Interjection ;
 Pronoun ;
 Numeral ;
 Acronym ;
 Punctuation ;
 Abbreviation ;

LEXICON ProperNoun   !!= * __@CODE@__ 
 Prefix-Proper ;
 ProperNoun-smn ; 
@N.CmpHyph.TRUE@ ProperNoun-smi-nocomp ; ! Lexicon for short names - always require hyphen
 ProperNoun-smi ;
! ProperNoun-smi-nocomp ;


LEXICON ENDLEX
!! !!!Lexicon @LEXNAME@
!! And this is the @LEXNAME@ of everything:
!! {{{
   @D.CmpOnly.FALSE@@D.CmpPref.TRUE@@D.NeedNoun.ON@ # ; !!≈ @CODE@
! @D.Need3Part.ON@ # ; !3part
!! }}}
!! The {{@D.CmpOnly.FALSE@}} flag diacritic is used to prevent words tagged
!! with +CmpNP/Only from ending here.
!! The {{@D.NeedNoun.ON@}} flag diacritic is used to block illegal compounds.