The main lex file                   Separate lex files for different POS (parts of speech)

|----------------------|            |--------------------|
| sme-lex.txt          |            | noun-sme-lex.txt   |
|                      |            | viessu GOAHTI ;    |   From the Root lexicon, there
| Root                 | ---------> | ...                |   are pointers to each POS.
|                      |            |                    |   The files for nouns, verbs and
| LEXICON GOAHTI       | <--------- |                    |   adjectives point back to the
| +N DEVNVCASE ;       |            |                    |   sme-lex.txt file, and are
| ...                  |            |--------------------|   directed to their respective
|                      |                                     sublexica.
|                      |            |--------------------|
|                      | ---------> | verb-sme-lex.txt   |   (the auxiliary verbs are
|                      | <--------- | ...                |   also found in the verb file)
|                      |            |--------------------|
|                      |
|                      |            |--------------------|
|                      | ---------> | adj-sme-lex.txt    |
|                      | <--------- | ...                |
|                      |            |--------------------|
|                      |
|                      |            |--------------------|   The other lex files contain
|                      | ---------> | closed-sme-lex.txt |   closed classes. They are
|                      | <- - - - - |                    |   smaller, and all the sublexica
|                      |            | LEXICON Pronoun    |   are in the same file, not in
|                      |            | Personal ;         |   the sme-lex file (well, some
|                      |            |                    |   point to some sme-lex
|                      |            | LEXICON Personal   |   sublexica). Other files are
|                      |            | ...                |   pp-lex.txt, etc. All in all
|----------------------|            |--------------------|   there are ca. 10 lex files.

This is compiled together with the        ||
twol rules. These rules contain the       ||
(morpho)phonological processes,           ||
consonant gradation, etc.                 \/

|------------|    |------------|    |------------|   The sme.save file is
|twol-sme.txt| => |twol-sme.bin| => |  sme.save  |   compiled in lexc, and
|------------|    |------------|    |------------|   is the merger of the
                                                     lex files and the rule
Here are the      After compilation       ||         file twol-sme.bin.
rules             in twolc they are       ||
themselves.       in this binary          ||
                  file.                   ||

Then come the preprocessor files:         \/

|----------|    |------------|      ||=========||   This is the final
|case.regex| => |caseconv.fst| ===> || sme.fst ||   morphological parser
|----------|    |------------|      ||=========||   for Northern Sami.

The case.regex file is
compiled in xfst.

The preprocessor itself, tok.fst, is not shown here.
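
The build order implied by the diagram can also be written down as a small dependency graph. The following is only a sketch in Python (standard library, version 3.9 or later), not part of the actual build setup. The names BUILD_STEPS and build_order are made up for illustration, the list of lex files is limited to the ones shown in the diagram, and the tool for the last step is left as None because the text does not say which program produces sme.fst.

```python
from graphlib import TopologicalSorter

# target -> (tool named in the text, input files).
# Hypothetical structure for illustration; None means the text names no tool.
BUILD_STEPS = {
    "twol-sme.bin": ("twolc", ["twol-sme.txt"]),
    "sme.save": ("lexc", [
        # Partial list: only the lex files shown in the diagram.
        "sme-lex.txt", "noun-sme-lex.txt", "verb-sme-lex.txt",
        "adj-sme-lex.txt", "closed-sme-lex.txt",
        "twol-sme.bin",
    ]),
    "caseconv.fst": ("xfst", ["case.regex"]),
    "sme.fst": (None, ["sme.save", "caseconv.fst"]),
}


def build_order(steps):
    """Return the targets in an order where every input is built first."""
    graph = {target: set(inputs) for target, (_tool, inputs) in steps.items()}
    ordered = TopologicalSorter(graph).static_order()
    # static_order() also yields the plain source files; keep only targets.
    return [name for name in ordered if name in steps]


if __name__ == "__main__":
    for target in build_order(BUILD_STEPS):
        tool, inputs = BUILD_STEPS[target]
        print(f"{target}: {tool or 'tool not stated'} <- {', '.join(inputs)}")
```

Whatever order the sorter picks for the independent steps, twol-sme.bin always comes before sme.save, and sme.fst always comes last, which matches the top-to-bottom reading of the diagram.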