Localisation
Northern Sámi
At present (January 2002), the Northern Sámi project is run in a
7-bit fashion, with digraphs (a1, c1, d1, n1, s1, t1, z1) for the 7
Sámi letters. This is an ad hoc solution. We hope to migrate
either to UTF-8 or to an 8-bit format, either ISO-IR 197 or
Latin 4. Both Linux localisers and the Xerox tools manuals claim
UTF-8 compatibility, so in theory this should be possible. Still,
Xerox advises us to use an 8-bit solution internally. This must be
sorted out.
Should we go for UTF-8, the following must be in place:
- The Linux/Unix platform of the project must be UTF-8 enabled
- We must find out how the Xerox tools handle UTF-8 in practice
- We must make a Northern Sámi keyboard for UTF-8
- Existing files must be converted to UTF-8
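The last step in the list above, converting the existing 7-bit files, could be sketched roughly as follows. This is only an illustration: the mapping from digraphs to letters (a1 = á, c1 = č, and so on), the upper-case digraphs and the function names are assumptions, not taken from the project files.

```python
import re

# Assumed mapping: the text lists the digraphs but not the letters
# they stand for, so this table is a guess at the obvious reading.
DIGRAPHS = {
    "a1": "á", "c1": "č", "d1": "đ", "n1": "ŋ",
    "s1": "š", "t1": "ŧ", "z1": "ž",
}
# Assume upper-case digraphs follow the same pattern (A1, C1, ...).
DIGRAPHS.update({k.upper(): v.upper() for k, v in list(DIGRAPHS.items())})

_PATTERN = re.compile("|".join(map(re.escape, DIGRAPHS)))


def digraphs_to_unicode(text: str) -> str:
    """Replace every known digraph with the corresponding Unicode letter."""
    return _PATTERN.sub(lambda m: DIGRAPHS[m.group(0)], text)


def convert_file(src_path: str, dst_path: str) -> None:
    """Read a 7-bit digraph file and write it back out as UTF-8."""
    with open(src_path, encoding="ascii") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(digraphs_to_unicode(text))
```

With this table, `digraphs_to_unicode("sa1megiella")` gives "sámegiella". Note that the sketch converts every occurrence of, say, "c1", so any file where such a sequence is meant literally would need separate handling.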
The two 8-bit code tables, Latin 4 and ISO-IR 197, are both supported by
iconv, and both contain the required symbols. Of the two, Latin 4
might be better supported, but ISO-IR 197 is a true superset of the
alphabetic repertoire of Latin 1, and should thus give no
compatibility problems with Latin 1 input. Since the general name files use
virtually every Latin 1 character, this compatibility with
Latin 1 makes ISO-IR 197 better suited than Latin 4. Furthermore, ISO-IR 197
makes it possible to represent Skolt and Inari Sámi as well. In
the long run, Unicode and UTF-8 are still the desired output, and
migrating directly from 7-bit to UTF-8 seems a better
solution. Support in Emacs, the shells, etc. is crucial.
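The Latin 1 compatibility argument can be checked mechanically. The sketch below uses Python's `iso8859_4` codec (Latin 4) to confirm that the Northern Sámi letters are present and to list the Latin 1 letters that Latin 4 lacks. Python's standard codecs do not appear to include ISO-IR 197, so only Latin 4 is checked here. The letter inventory and the function names are assumptions made for the illustration.

```python
# Assumed inventory: the seven Northern Sámi letters behind the digraphs.
NORTHERN_SAMI = "áčđŋšŧžÁČĐŊŠŦŽ"


def encodable(ch: str, codec: str) -> bool:
    """Return True if the character exists in the given code table."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def latin1_letters_missing_from(codec: str) -> list:
    """List the letters in the upper quarter of Latin 1 that codec lacks."""
    # 0xC0-0xFF holds the accented letters of Latin 1 (plus two symbols,
    # which isalpha() filters out).
    latin1 = bytes(range(0xC0, 0x100)).decode("latin-1")
    return [ch for ch in latin1 if ch.isalpha() and not encodable(ch, codec)]


sami_ok = all(encodable(ch, "iso8859_4") for ch in NORTHERN_SAMI)
missing = latin1_letters_missing_from("iso8859_4")
```

Here `sami_ok` is true, while `missing` contains letters such as ñ and è, which is the kind of gap that would cause trouble for name files written in Latin 1.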
Lule and Southern Sámi
Southern and Lule Sámi are adequately represented in 8-bit
format by Latin 1 (the Lule Sámi files use ñ for
n-acute). These files can be carried over either to ISO-IR 197 or (?)
to Latin 4, or they can be kept in Latin 1.
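Should these files instead follow the Northern Sámi files to UTF-8, the ñ stand-in would presumably have to be turned into a real n-acute along the way. A minimal sketch, assuming that every ñ in a Lule Sámi file really stands for n-acute (the function name is hypothetical):

```python
# Assumption: in the Lule Sámi files, ñ/Ñ never occur as themselves,
# only as stand-ins for n-acute (U+0144 / U+0143).
_N_ACUTE = str.maketrans({"\u00f1": "\u0144", "\u00d1": "\u0143"})


def lule_latin1_to_unicode(data: bytes) -> str:
    """Decode Latin 1 bytes and restore n-acute from its ñ stand-in."""
    return data.decode("latin-1").translate(_N_ACUTE)
```

The result is an ordinary Python string and can be written out with `encoding="utf-8"`. If a file also contains genuine ñ, for example in names, a blanket replacement like this would be wrong and the file would need manual review.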
Inari and Skolt Sámi
Inari Sámi may be represented in the current 8-bit format, with
the vowels directly in 8-bit Latin 1 and the consonants as digraphs c1
etc.
Note that it would be much harder to represent Skolt Sámi in the
way sketched for Inari Sámi. The localisation issue should thus
be solved before Skolt Sámi can be included.
Last modified: Fri May 17 00:10:06 CEST 2002