Plan for the samest project, 2014 - 2016

!!!FST

!!Estonian FST - 2013 - 2015

This comes before both the Oahpa and the MT implementation.

* Revise the plamk fst or integrate it in the gt infra
** Degree of adjustment of fst (discuss: Jaak, Heli, Trond + Sjur, soon after Nov 18th)
** Revision -- Tag adjustment
* __Goals__: Ability to generate Oahpa, MT, Dictionaries

!!Võru FST - 2013 - 2015

* Oahpa quality: generating the pedagogical lexicon
** 70 stem classes, 2m to arrive at generating them

!!Work procedure

FST group now in the beginning, discussing both vro and est (Sulev, Jack, Jaak, Neeme, Heli, Trond, Heiki).

Goal:
* Select, make and proofread yaml files for testing (est and vro)
* Teach fst writing
* Learning by doing
* Later on leaving FST writing to the primary linguists

!!!MT

!!Extracting bilingual dictionaries

# From corpora
## sme (gt) + fin (gt, hy) + est (filosoft) lemmatisation software in place
## corpora in place? Opus is in place (more sme-fin? Bible, other stuff from gt corpus?)
## TODO1: Set up Giza++ and Moses and do the alignment (Zürich)
## TODO2: Revise output with UPLUG + redo the tuning + find the best machine output (1+1 m)
## TODO3: Do the manual revision of the bilingual dictionaries (1+1 m)
# From existing dictionaries
## (fin-rus/eng + rus/eng-est etc) (tasks 2.-4. 1 m)
## From WordNet, and from Wiktionary

!!Saami-Finnish MT

Alignment

!!Finnish-Estonian MT

The application said:
* Evaluation of Oahpa by teachers and students - 2015-2016
* Publication of results at conferences - 2014 - 2016

Task order:
# start with alignment
# start with fst adjustment
# start with getting some dictionary, e.g. nob-est, est-nob (additional nicety)

!!!Oahpa

* Setting up Estonian and Võro Oahpa - 2014-1
* Numra est, vro - 2014-1
* Leksa - 2014-1,2
* Morfa-S - 2014-1,2
* Morfa-C - 2015

We will also give Vasta and Sahka a go.

!!vro oahpa

# Set up: 2014-1 \\ Prototypes of Morfa-S, Leksa, Numra: 2014-2 \\ Morfa-S: fst to generate forms of vocabulary \\ Leksa: vocabulary + semantic markup
# Use in courses, feedback, adjustment: 2014-3,4

Later: Morfa-C, further work

Teachers:
* est: Ilona and others
* vro: Sulev is the teacher + colleagues in Võro Institute + TÜ

Time schedule:
* 2013-4 Estonian fst (discussion, adjustment), Võro fst: approaching Oahpa quality
* 2014-1 Set up Oahpa for est, vro
* 2014-2 Work with Oahpa for est, vro; work with fsts
* 2014-3 Oahpa in courses: vro, est
* 2014-4 Oahpa in courses: vro, est
* 2015 To be planned later
* 2016 To be planned later

!!!Resources: People, tasks, time allocated

!!Time - month schedule

|| Task || 2013 || 2014 || 2015 || 2016 || Persons
| Estonian FST | 0.5 | 3 | 2 | 2 | Jaak, Heli, Heiki
| Võro FST | 0.5 | 3 | 3 | 3 | Sulev
| CG | 0 | 0 | 2 | 2 | Tiina, Kadri
| Oahpa L, N, M | 0 | 5 | 2 | 1 | Heli, teachers
| Oahpa V, S | 0 | 0 | 2 | 3 | Heli, teachers
| W Extr. fi-et | 0 | 3 | 0 | 0 | Mark Fishel, Kaarel Veskis, Katrin Tsepelina
| W Extr. fi-se | 0 | 2 | 0 | 0 | To be decided
| MT fin-est | 0 | 0 | 2 | 2 | -
| MT fin-sme | 0 | 0 | 1 | 1 | -
| MT fi CG | 0 | 1 | 0 | 0 | -
| Total | 1 | 17 | 14 | 14 | 51+

* Weighting, tasks: EstFST + VroFST + Oahpa + MT + Admin = as above
* Weighting, years: 2014, 2015, 2016 = quite even

!!People

|| Persons || 2013 || 2014 || 2015 || 2016 || Role
| Heiki-Jaan Kaalep | 0 | 1.5 | 1 | 1 | leader, FST
| Kaarel Veskis | 0 | 1.5 | 0 | 0 | Evaluation
| Katrin Tsepelina | 0 | 1.5 | 0 | 0 | UPLUG setup
| Mark Fishel | 0 | 0 | 0 | 0 | Corpus, giza++, moses setup
| Sulev Iva | 0.5 | 3 | 3 | 3 | fst
| Jaak Pruulmann-Vengerfeldt | 0.5 | 1.5 | 1 | 1 | fst
| Tiina Puolakainen | 0 | 0 | 1 | 1 | cg
| Kadri Muischnek | 0 | 0 | 2 | 2 | cg, MT
| Tarmo Vaino | 0 | 1 | 1 | 1 | guru
| Heli Uibo | x | 4 | 4 | 4 | Oahpa, FST
| Teachers | 0 | 1 | 1 | 1 | Oahpa
| Saami to be decided | 0 | 2 | 1 | 1 | W Extr, MT
| Sum | 1 | 17 | 15 | 15 | 48 (cf. 51.2)

!!!Travel

!!Meetings

Startup in Tromsø in January/February 2014

Meetings in Tartu

Mid-project in

Other expenses

| Laptops | 2000 | 3 | 6000
| Travel | 2000 | 13 | 31200
| Conferences | 5000 | 4 | 24000
| Total | | | 61200

!!Workshops

ICALL workshop on Oahpa-related things in cooperation with the Tromsø ICALL group

!!!Papers

Brainstorming:
* The "See what I have done" paper
** Võru fst
** Estonian Oahpa
** MT comparison: RBMT vs. SMT
* Specific problems papers
** Cross-language FST comparisons
** Papers based upon user logs
** Papers based upon specific problems during our work