Documentation of Southern Sámi rules


This rule formalism is fundamentally different from the one used for Northern Sámi. Whereas Northern Sámi is modeled as a lexical transducer, Southern Sámi uses Koskenniemi's original morphophonological approach.

The Alphabet section

The alphabet consists of the letters of the English alphabet, the Norwegian and Swedish letters, and the letter ï. In actual texts the Norwegian and Swedish letters are used interchangeably, and i is often written for ï. The present system does not account for that; a parser aimed at real texts must incorporate these conventions. The two-letter symbols A1 etc. are used as indicated below:

 A1 = illative singular vowel
 A2 = ending vowel in words like 'maana'
 A3 = second vowel in words like 'daktere'
 A4 = first vowel in words like 'jeptsie'
 E1 = first vowel in words like 'sjiellie'
 U2 = ending vowel in words like 'nïejte'
 I1 = ie in all cases except ending vowel in words like 'gåetie'
 I2 = ending vowel in words like 'gåetie'
 I3 = first vowel of many case endings
 I4 = ending vowel in all three-syllable nouns, like 'gierehtse'
 O1 = oe in all cases except ending vowel in words like 'bearkoe'
 O2 = ending vowel in words like 'bearkoe'
 U1 = first vowel in words like 'njueslie'
 Æ  = first vowel in words like 'klihtie'
 Å1 = first vowel in words like 'gullie'
 Å2 = first vowel in words like 'gaevlie'
 Å3 = first vowel in words like 'gåetie' when umlauting in plural
 Å4 = first vowel in words like 'gåetie' when not umlauting in plural
 D1 = possible doubling of preceding consonant
 ... more to come for sure ;)
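
The spelling conventions mentioned above (interchangeable Norwegian and Swedish letters, i written for ï) could be handled by folding the variants onto one form before lookup. The following is only a minimal sketch of that idea and not part of the rule file: the function name fold_orthography is hypothetical, and the letter pairs æ/ä and ø/ö are an assumption about which letters are meant.

```python
import unicodedata

# Hypothetical pre-processing step; not part of the rule file described here.
# Assumption: the interchangeable Norwegian/Swedish letters are æ/ä and ø/ö.
_FOLD = str.maketrans({
    "ä": "æ", "Ä": "Æ",
    "ö": "ø", "Ö": "Ø",
    "ï": "i", "Ï": "I",
})


def fold_orthography(text: str) -> str:
    """Map spelling variants onto one form, for use as a lookup key only."""
    # Normalise first so that a decomposed "i + diaeresis" becomes a single "ï".
    return unicodedata.normalize("NFC", text).translate(_FOLD)


print(fold_orthography("nïejte"))  # niejte
print(fold_orthography("niejte"))  # niejte
```

Folding ï to i removes a real distinction, so the folded string is probably only suitable as a lookup key; both the input word and the lexicon keys would have to be folded in the same way.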

Rules section

Cf. the NJL article, and Karttunen's (written) comments.

There is one unresolved issue: the D1. According to the legend, it is used for doubling the preceding consonant. This seems to be the case in the lexicon as well, where it is found in the Illative Singular suffix, a suffix that has the form -sse for trisyllables and -se for bisyllables. The problem is that I cannot find any D1 rule to that effect. Wherever it is, it gives us two illative suffixes for both bi- and trisyllables, instead of the intended distribution.


The plural accusative of jaevrie comes out as jeevride; it should be jaevride.

The same error occurs for gaerie.

It seems that it does not recognise some nouns: bielkie, biehkie, etc. Others work, e.g. jiekie.

The error could lie in the different encoding in the lexicon, cf. above. Perhaps we should have had "bE1hkI2", patterning with "jiekie".

jiekie:jE1kI2 N_IE; 
biehkie:bI1hkI2 N_IE; 
biejjie:bI1jjI2 N_IE; 
bielie:bI1lI2 N_IE; 
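
To make the difference in encoding easier to see, the entries above can be grouped by the first two-letter symbol in their lexical form. This is a small sketch for inspection only; the helper names are hypothetical, and the pattern for two-letter symbols is an assumption based on the legend in the alphabet section (a capital letter followed by a digit from 1 to 4).

```python
import re
from collections import defaultdict

# The four lexicon entries quoted above.
ENTRIES = """\
jiekie:jE1kI2 N_IE;
biehkie:bI1hkI2 N_IE;
biejjie:bI1jjI2 N_IE;
bielie:bI1lI2 N_IE;
"""

# lemma:lexical-form  continuation-class ;
ENTRY_RE = re.compile(r"^([^:\s]+):(\S+)\s+([^;\s]+)\s*;\s*$")
# Assumed shape of the two-letter symbols (A1, E1, I2, Å3, D1, ...);
# anything else counts as a single-character symbol.
SYMBOL_RE = re.compile(r"[AEIOUÅD][1-4]|.")


def tokenize(lexical):
    """Split a lexical form into two-letter and single-letter symbols."""
    return SYMBOL_RE.findall(lexical)


def first_two_letter_symbol(lexical):
    """Return the first two-letter symbol, or None if there is none."""
    for symbol in tokenize(lexical):
        if len(symbol) == 2:
            return symbol
    return None


def group_by_first_symbol(text):
    """Map each first two-letter symbol to the lemmas that use it."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            lemma, lexical, _continuation = match.groups()
            groups[first_two_letter_symbol(lexical)].append(lemma)
    return dict(groups)


for symbol, lemmas in sorted(group_by_first_symbol(ENTRIES).items()):
    print(symbol, lemmas)
# E1 ['jiekie']
# I1 ['biehkie', 'biejjie', 'bielie']
```

Only jiekie is encoded with E1 among these four entries; the other three use I1.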

The syllable counting

Our original rule file managed to count syllables, and alternate between -se and -sse in the Illative endings. Karttunen's version does not do that.
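
The intended distribution can be stated outside the rule formalism as follows. This is a minimal sketch and not a reconstruction of the original rule file: it assumes that a syllable corresponds to one unbroken run of vowel letters, and the function names and the vowel inventory are hypothetical.

```python
import re
import unicodedata

# Assumption: every unbroken run of vowel letters forms one syllable nucleus.
VOWEL_RUN = re.compile(r"[aeiouyæøåäöï]+")


def count_syllables(word):
    """Count the vowel runs in a word."""
    normalised = unicodedata.normalize("NFC", word).lower()
    return len(VOWEL_RUN.findall(normalised))


def illative_suffix(word):
    """Choose between -se and -sse by syllable count."""
    syllables = count_syllables(word)
    if syllables == 2:
        return "se"
    if syllables == 3:
        return "sse"
    # Other lengths are not covered by the description above.
    raise ValueError(f"no suffix defined for {syllables} syllables: {word!r}")


print(count_syllables("gierehtse"), illative_suffix("gierehtse"))  # 3 sse
print(count_syllables("nïejte"), illative_suffix("nïejte"))        # 2 se
```

The sketch returns only the suffix and does not build the full word form, since the stem vowel alternations are handled by other rules.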

Last modified: Thu Nov 22 14:47:38 CET 2001