sme-sma-mt meeting 12.8.2013
Francis, Lene, Trond.
!!!Agenda
* Evaluation
* Plan, overall principles
* Analysis
* Linguistic transfer issues
**Px
**Inflected forms
**Numerals
**Lexical selection
**...
* Generation
!!!Evaluation
The abstract and hence the plan:
* Present sme2sma as a pilot, showing that it is feasible.
* Choose a narrow domain.
Evaluation procedure:
# Send text pairs to the sma translators: the nob original and the sme2sma MT output
(the sme2sma text perhaps enriched with missing words).
## Which is quicker: editing the sma MT output or translating from scratch?
## Method: give each translator two texts, one to translate and one to edit.
Two texts to three translators: one nob-only, one nob + smaMT.
# Questions:
## Time the task
## Ask the translators: How did you like the smaMT text?
## Hypothesis: the smaMT text has a less Norwegian syntax, and this can be seen as an asset (?)
There is a similar study evaluating es2pt, which gave pt translators
an en original and an es2pt MT text. Here is the paper:
"Using the Apertium Spanish-Brazilian Portuguese machine translation system for localization".
François Masselot, Petra Ribiczey (both Autodesk) and Gema Ramírez-Sánchez (Prompsit)
Annual Conference of the European Association for Machine Translation in 2010.
Content:
* 2 articles, each one or two pages
* 3 translators
!!!Plan, overall principles
Content:
# sme: Improve the analysis (syntactic functions...)
# sme-sma texts: pick words, add words
# sme-sma mt-tests: improve the syntax, morphosyntax
# sma: Improve the generation (double forms, ...)
## Worst-case-fix: word1/word2 => word1
# sma and sme: add missing words to fst
# CG-rules for lexical selection
# Improve/finish sme/src/smi-syn.rle (the file is temporarily in sme/src/)
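
The worst-case fix for double forms (word1/word2 => word1) could be sketched roughly as follows. This is only a minimal Python sketch, assuming the generator output marks alternative forms with a slash inside a single token; the function name {{pick_first_form}} is hypothetical and not part of the existing pipeline.

```python
def pick_first_form(generated: str, separator: str = "/") -> str:
    """Keep only the first alternative of each 'word1/word2' token.

    Assumption: double forms appear as slash-separated alternatives
    inside one whitespace-delimited token. Tokens without the separator
    pass through unchanged. Runs of whitespace are collapsed to one space.
    """
    kept = []
    for token in generated.split():
        first = token.split(separator)[0]
        # A token that starts with the separator would yield an empty
        # string; in that case keep the token as it was.
        kept.append(first or token)
    return " ".join(kept)


if __name__ == "__main__":
    print(pick_first_form("word1/word2 word3"))
```

A proper fix would choose between the forms on linguistic grounds; always taking the first one is only the fallback named above.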
Online:
* [https://gtweb.uit.no/mt/]
* Update:
** gtweb: {{/opt/mt/README}}
Apertium Wiki:
* [http://wiki.apertium.org/wiki/Talk:North_S%C3%A1mi_and_South_S%C3%A1mi]
__Deadlines:__
* Find texts
* Find translators
* 30.8. Send texts to the translators
* 15.9. Receive evaluation from the translators
* 26.9. Conference
!!!Analysis
sme-dis.rle vs. Old-sme-dis.rle
Some syntactic tags are missing. Linda used
syntactic functions in her rules.
Lene will spend a day or two on that.
We do not use dependency.
Evaluate Francis' tag conversion: analyse the same
sme text twice, with identical morphology and identical
disambiguation (dis), once with gt tags and once with Francis' converted
apertium tags.
Francis will look into that and report the differences.
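
One possible way to find such differences is to compare the two outputs cohort by cohort. The following is a minimal Python sketch, assuming both outputs are in the CG stream format (a {{"<wordform>"}} line followed by indented reading lines that start with {{"lemma"}}). Because the two tag sets differ by design, it compares only the word forms and the lemmas of the surviving readings, so it can reveal differences in disambiguation outcome but not individual tag mismatches. The function names are hypothetical.

```python
import re
from itertools import zip_longest

# Assumed CG stream format: cohort lines look like "<wordform>",
# reading lines are indented and start with "lemma".
COHORT = re.compile(r'^"<(.*)>"')
READING = re.compile(r'^\s+"([^"]*)"')


def parse_cohorts(text):
    """Return a list of (wordform, sorted tuple of reading lemmas)."""
    cohorts = []
    for line in text.splitlines():
        cohort = COHORT.match(line)
        if cohort:
            cohorts.append((cohort.group(1), []))
            continue
        reading = READING.match(line)
        if reading and cohorts:
            cohorts[-1][1].append(reading.group(1))
    return [(form, tuple(sorted(lemmas))) for form, lemmas in cohorts]


def report_differences(gt_text, apertium_text):
    """List (index, gt_cohort, apertium_cohort) wherever the two differ.

    A cohort missing on one side shows up as None.
    """
    pairs = zip_longest(parse_cohorts(gt_text), parse_cohorts(apertium_text))
    return [(i, gt, ap) for i, (gt, ap) in enumerate(pairs) if gt != ap]


if __name__ == "__main__":
    # Illustrative data only, not real analyser output.
    gt = '"<word>"\n\t"word" N Sg Nom\n\t"word" V Inf\n'
    ap = '"<word>"\n\t"word" n sg nom\n'
    for difference in report_differences(gt, ap):
        print(difference)
```

An empty result means that the same readings survived under both tag sets, at least at the lemma level.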
!!!Linguistic issues
!!Inflected forms
There are two ways of translating positive-degree adjectives in the attributive form:
# to adjective (attr -> attr)
# to a noun in the genitive
Here are the cases:
1)