Documenting the Skolt Sami Rule file
Introduction
xfst vs. twolc
The Skolt Sami rule file differs from those of the other languages in being an
xfst file rather than a twolc file (the two formats are covered in separate
chapters in the pdf version of the Xerox book). The differences
between these two formats may be summarised as follows:
- xfst rules are ordered, whereas twolc rules apply simultaneously
- xfst rules stand in a feeding-bleeding relation to each other, whereas twolc rules do not
- xfst rules handle many-to-many replacement operations well, including replacements between strings of different lengths, whereas twolc needs one rule for each pair of members, and the strings must be of equal length
- xfst cannot generalise over ordered sets of pairs, so the Cx:Cy replacement between members of the ordered sets Cx and Cy, which is available in twolc, must be expressed more clumsily in xfst.
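The first two points above can be illustrated with a minimal Python sketch. It is not xfst or twolc code, and it simplifies both formalisms considerably: the function names and the toy rules a -> b and b -> c are assumptions made for illustration only, not rules from the Skolt Sami file.

```python
def apply_ordered(word, rules):
    """Apply each (old, new) rule in turn; later rules see earlier output."""
    for old, new in rules:
        word = word.replace(old, new)
    return word


def apply_simultaneous(word, rules):
    """Apply all single-symbol rules at once, each against the original input."""
    table = dict(rules)
    return "".join(table.get(symbol, symbol) for symbol in word)


# Hypothetical toy rules, not taken from the Skolt Sami rule file.
RULES = [("a", "b"), ("b", "c")]

# Ordered: a -> b produces new b's, which then undergo b -> c (feeding).
ordered = apply_ordered("ab", RULES)

# Simultaneous: every symbol is rewritten once, so no rule feeds another.
simultaneous = apply_simultaneous("ab", RULES)
```

With the input "ab", the ordered cascade yields "cc" while the simultaneous application yields "bc". The difference comes entirely from the first rule feeding the second.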
xfst was chosen for Skolt Sami (and possibly, in the future, also
for Inari Sami) because the Skolt Sami grammar contains several cases of
many-to-many replacement.
The structure of the rule file itself
The rule file for Skolt Sami is divided into five main sections:
- Definitions
- Consonant shift rules (tbw)
- Vowel alternation rules
- Consonant gradation rules
- Rules for cleaning up and composing the end result
The structure of this document
The next section gives a general discussion of the rule
format of xfst. Thereafter, the parts of the rule file will be dealt
with in the order of the five-point list given above.
Rule format and the cascade of rules
Xfst rules are of the following format: interface offers 4 different operator
The definitions
The xfst file has no section corresponding to the Alphabet section of
twolc. Instead, the sets are defined directly by means of the OR operator
(written "|").
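As a rough analogy only, the same idea can be sketched in Python, where "|" happens to denote set union. The set names and their members below are hypothetical and do not come from the rule file; the comment shows what a comparable xfst definition would probably look like.

```python
# Hypothetical symbol sets; the actual rule file defines its own.
# A comparable xfst statement would look roughly like:
#   define Vow [ a | e | i | o | u ] ;
FRONT_VOWELS = {"e", "i"}
BACK_VOWELS = {"a", "o", "u"}

# "|" is set union here, mirroring the OR operator described above.
VOWELS = FRONT_VOWELS | BACK_VOWELS


def is_vowel(symbol):
    """Return True if the symbol belongs to the (hypothetical) vowel set."""
    return symbol in VOWELS
```

Larger sets are built from smaller ones in the same way, so no separate alphabet declaration is needed.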
The consonant shift rules
This section will include, among other things, word-final consonant changes.
The vowel alternation rules
Last modified: Tue Mar 4 12:35:30 GMT 2003