Apertium sme-sma 28.8.

Present: Francis, Lene, Trond.

!!!Agenda:

* Tag reordering
* Other issues

!!!Tag reordering

{{{
bidix                  fst            result
---------------------------------------------------------------
skuvla     .INTERSECT. skuvla     =   ()
skuvla     .INTERSECT. skuvla     =   (skuvla)
skuvla     .INTERSECT. skuvla     =   (skuvla)
---------------------------------------------------------------
}}}

But:


Solutions:

# move tags
# delete tags

So: we do the following.

# Reshuffle tags in the fst:
## prop tags are kept as they come from the fst
## nouns are reordered by a script in sme/src/filters
## adverbs are kept as they come from the fst
# The tags are then matched for the pruning of the fst in Apertium
# The bidix does not contain semtags
# The cg in Apertium removes the tags

Here is the explicit description of what we want:

{{{
---------------------------------------------------------------
fst:   ... after reshuffle
       ... from the fst
cg:    semtags in, nosemtags out
bidix: no sem-tags
---------------------------------------------------------------
}}}

!!!Other issues

!!jagi 1974

{{{
Son lea riegádan "jagi 1974" ja bajásšaddan "Romssas".
Dihte "jaepien 1974" reakedi jih "Tromsene" byjjesovvi.

^Son/Son<@SUBJ→>$ ^lea/leat<@+FAUXV>$ ^riegádan/riegádit<@-FMAINV>$ ^jagi/jahki<@←ADVL>$ ^1974/1974<@N←>$ ^ja/ja<@CNP>$ ^bajásšaddan/bajásšaddat<@-FMAINV>$ ^Romssas/Romsa<@←ADVL>$^../..$

$ echo "Son lea riegádan jagi 1974 ja bajásšaddan Romssas." | apertium -d . sme-sma
Dïhte jaepien 1974 reakadamme jïh Tromsøesne byjjenamme.

^pron-pers<@SUBJ→>{^Dïhte$}$ ^v<@+FMAINV>{^reakadidh$}$ ^n-num<@←ADVL>{^jaepie$ ^1974$}$ ^cc<@CNP>{^jïh$}$ ^prfprc<@-FMAINV>{^byjjenidh$}$ ^n<@←ADVL>{^Tromsø$}$^sent<@X>{^..$}$
}}}

{{{
"<Son>"
	"son" Pron Pers Sg3 Nom @SUBJ>
"<lea>"
	"leat" V IV Ind Prs Sg3 @+FAUXV
"<riegádan>"
	"riegádit" V IV PrfPrc @-FMAINV
"<jagi>"
	"jahki" N Sg Gen @<ADVL
"<1974>"
	"1974" Num Sg Nom @N<
"<ja>"
	"ja" CC @CNP <==================== @CVP would tell that there is coming a new finite verb
"<bajásšaddan>"
	"bajásšaddat" V IV PrfPrc @-FMAINV
"<Romssas>"
	"Romsa" N Prop Sg Loc @<ADVL
"<.>"
	"." CLB
}}}

Links:

* [http://wiki.apertium.org/wiki/North_Saami_-_South_Saami_syntactic_issues]
* [http://wiki.apertium.org/wiki/North_Saami_-_South_Saami_morphological_issues]
* [http://wiki.apertium.org/wiki/North_Saami_-_South_Saami_bilingual_lexicon]

{{{
Before:
$ echo "Nieiddat leat čeahpit. " | apertium -d . sme-sma
Nïejth leah væjkelh.

After:
$ echo "Nieiddat leat čeahpit. " | apertium -d . sme-sma
Nïejth leah væjkele.
}}}

{{{
^buot/buot<@→N>$ ^gielaid/giella<@←OBJ>$
^attr-n<@←OBJ>{^gaajhke$ ^gïele$}$
gaajhkide gïelide
}}}

{{{
echo "Mun boađán boahtte jagi." | apertium -d . sme-sma
Manne båetije jaepien båatam.

båetije	båetedh+V+IV+PrsPrc
båetije	båetedh+V+IV+Der/NomAg+N+Sg+Nom
}}}

Here is a solution:


* ND = number to be determined (take it from the noun)
* CD = case to be determined (take it from the noun)

Lexicon entry form:

{{{



}}}

Now:

{{{
^pron-pers<@SUBJ→>{^Manne$}$ ^v<@+FMAINV>{^båetedh$}$ ^a-n<@←ADVL>{^båetije$ ^jaepie$}$^sent<@X>{^..$}$

$ echo "Mun boađán boahtte jagi." | apertium -d . sme-sma
Manne #båetije jaepien båatam.
}}}

So, what is the role of ''båetijen''?

{{{
boahtte jagis -> båetijen jaepesne
attr    loc      gen.attr ine

båetije   båetije+A+Attr   <=== remove Attr and give cases
båetijen  båetije+A+Gen+Attr

båetije   båetije+A+Sg+Nom
båetijen  båetije+A+Sg+Acc Use/MT
båetijen  båetije+A+Sg+Gen
båetijen  båetije+A+Sg+Ine
båetijen  båetije+A+Sg+Ela
båetijen  båetije+A+Sg+...

sme:
buori     buorre+A+Sg+Gen
buori     buorre+A+Sg+Acc
buorre buorre buorre+A+Sg+Nom

boahtte   boahtte+A+Attr => båetije/båetijen

Pron+Dem @→N = +Det+Dem (?)
Pron+Dem +Attr @→N = +Det+Dem (?)
}}}

We then have four types:

!indecl sme -> decl sma

{{{



}}}

!indecl sme -> indecl sma

-

!decl sme -> decl sma

-

!decl sme -> indecl sma

{{{



}}}

* vihkeles+A+Superl+Sg+Nom vihkelommes
* vihkeles+A+Sg+Acc vihkelem

He sees the X cat.

{{{
in langs/sma:
./configure --with-hfst --enable-apertium --enable-oahpa

the file is sma/src/morphology/*/adjectives-oahpa.lexc
}}}

!!Lexical selection

''boahtit oidnosii -> våajnoes sjïdtedh''

Default translation for boahtit is båetedh, alternative translation is sjïdtedh

{{{



}}}

apertium-sme-sma.sme-sma.lrx

{{{
}}}

{{{
^boahtit<@+FMAINV>/båetedh<@+FMAINV>/sjïdtedh<@+FMAINV>$ ^oidnosii<@←ADVL>/våajnoes<@←ADVL>$

^Dat/dat/dat$ ^boahtá/boahtit$ ^oidnosii/oidnosii$^./.$

#Artihkele 7:m *geatnegahttá nasjonaalestaatide tjïrrehtidh konkreetide råajvarimmide vaarjeleminie åvteste unnebelåhkoengïeli nimhtie ahte dah våajnoes sjidtieh dovne politihkesne, laakine jïh åtnosne.

^pron-dem<@SUBJ→>{^dïhte$}$ ^v<@+FMAINV>{^sjïdtedh$}$ ^adv<@←ADVL>{^våajnoes$}$

<@←ADVL>

$ bash dev/test-prefix.sh "juoga"
}}}

!!error test report

{{{
sh generation-errors.sh | grep '#' | grep '$ #jïjnje\\\\\ ^jïjnje/jïjnje/jïjnje/jïjnje/jïjnje$^./.$
2 ^måedtie$ #måedtie\\\\ ^måedtie/måedtie$^./.$
2 ^aktege$ #aktege\\ ^aktege/akte/aktege$^./.$
1 ^naan$ #naan\\\\ ^naan/naan$^./.$
1 ^naan$ #naan\\\\ ^naan/naan$^./.$
1 ^dagkeres$ #dagkeres\\\\\ ^dagkeres/dagkeres/dagkeres$^./.$
}}}

!!SVO - SOV with adjectives

SVO to SOV, but also S V AN to S AN V

* Son oaidná skuvlla. Son oaidná ođđa skuvlla.
** Before the fix: Satne skuvlem vuajna. Satne vuajna orre skuvlem.
** After the fix: Son oaidná ođđa skuvlla. Dïhte orre skuvlem vuajna.

{{{
^pron-pers<@SUBJ→>{^Dïhte$}$ ^v<@+FMAINV>{^vuejnedh$}$ ^a-n<@←OBJ>{^orre$ ^skuvle$}$ ^sent<@X>{^..$}$
}}}

The trick was to define a ''phrase'' (SN) in the t1x file, and then refer to that phrase later.
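
The effect of treating adjective + noun as one phrase can be illustrated outside Apertium. The following is a minimal Python sketch, not the t1x rule itself: the function names {{parse_chunks}} and {{svo_to_sov}} are hypothetical, and the chunk format is assumed to be the one in the debug output above. Because ''orre skuvle'' already sits in a single a-n chunk, moving that chunk in front of the verb moves the whole phrase, which gives S AN V.

```python
import re

# One chunk in the debug output: ^name<tag>...{^word$ ^word$}$
CHUNK = re.compile(r"\^([^<{]+)((?:<[^>]+>)*)\{(.*?)\}\$")


def parse_chunks(stream):
    """Return (name, tags, text) for every chunk found in the stream."""
    return [
        (m.group(1), re.findall(r"<([^>]+)>", m.group(2)), m.group(0))
        for m in CHUNK.finditer(stream)
    ]


def svo_to_sov(stream):
    """Move an object chunk in front of the finite main verb before it.

    Illustration only: handles the first verb + object pair and
    rejoins the chunks with single spaces.
    """
    chunks = parse_chunks(stream)
    for i in range(len(chunks) - 1):
        verb, obj = chunks[i], chunks[i + 1]
        if "@+FMAINV" in verb[1] and "@←OBJ" in obj[1]:
            # The object chunk (noun, or adjective + noun) moves as a unit.
            chunks[i], chunks[i + 1] = obj, verb
            break
    return " ".join(text for _, _, text in chunks)


if __name__ == "__main__":
    svo = ("^pron-pers<@SUBJ→>{^Dïhte$}$ ^v<@+FMAINV>{^vuejnedh$}$ "
           "^a-n<@←OBJ>{^orre$ ^skuvle$}$ ^sent<@X>{^..$}$")
    print(svo_to_sov(svo))
```

The sketch only swaps adjacent chunks; anything between verb and object (adverbials, for instance) would need its own rule.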