!!!Meeting setup

* Date: 30.6.2008
* Time: 09.30 Norw. time
* Place: Internet
* Tools: SubEthaEdit, iChat

!!!Agenda

Cf. one of the following, depending on context:

* the upper bar of the SEE window (provided you use the JSPWiki syntax mode)
* the TOC in Forrest-rendered output, like HTML and PDF

!!!Opening, agenda review, participants

Opened at 11:50.

Present: __Børre, Jovsset, Sjur, Trond__

Absent: __Per-Eric, Thomas, Tomi__

Agenda accepted as is.

!!!Updated task status since last meeting

!!Børre

* update svn user documentation as needed
* try to repair G5 accounts for iCal Server
* make a test-all target that runs all tests we have
* define and document testing routines
* fix the remaining hunspell conversion bugs
* send new svn e-mail 23.6 as a reminder
* change license on hunspell distros to GPL2+
* [fix bugs!|http://giellatekno.uit.no/bugzilla]

!!Jovsset

* follow up on {{sma}} corpus texts
* translate leaflet text
* Talk to __John Marcus Kuhmunen__ about layout, pictures for the leaflet.
* Talk to __John Marcus Kuhmunen__ about printing
* Presentation at the Sámi publisher meeting
* distribute CD version through the library bus, the language centres and common Sámi centres in all of Sápmi

!!Lene

* get the ped content ready
* Work on test routine with __Trond__ and __Sjur__

!!Per-Eric

* follow-up on the {{smj}} texts from __Kurt Tore__ (__Per-Eric__)
* follow-up contracts from ''Nord-Salten avis'' and __Lena Davidsson__
* Work with missing list same_dutkama_pgr.txt
* Plan a {{smj}} pr tour for our tools
* [fix bugs!|http://giellatekno.uit.no/bugzilla]

!!Saara

* add new XSL/XML headers for proofing test docs
* Set up ways of adding meta-information for proofing correct corpus docs (source info, used in testing or not, added to lexicon or not)
* implement the ped UI and functionality

!!Sjur

* update svn user documentation as needed
* follow up on {{sma}} corpus texts
* name db/risten.no
* update the ''Changes'' document
* follow-up on some Polderland-related bugs: 621, 630, 652
* InDesign documentation
* make a test-all target that runs all tests we have
* define and document testing routines
* write leaflet text
* [fix bugs!|http://giellatekno.uit.no/bugzilla]

!!Thomas

On vacation

!!Tomi

On vacation

!!Trond

* Set up Jabber for Lene
* make a test-all target that runs all tests we have
* define and document testing routines
* Dictionaries
* update svn user documentation as needed
* prepare external users: Meänkieli and Greenlandic groups, Jack Ruether
* [fix bugs!|http://giellatekno.uit.no/bugzilla]

!!!Pedagogical software online

Meeting memos can be found at [http://giellatekno.uit.no/ped/index.html#Meeting+memos]

__TODO:__

* get the content ready (__Lene, Biret, Trond__)
* implement the UI and functionality (__Saara__)
* get an easy-to-remember URL (__UiT/IT__)
* More thorough skin, layout, ... (__External person within the Ped team__, __Internal forrest expert__). This will be postponed until later.
* Make a pedagogical speller (__Tomi__ when finished with his MA thesis)
** Turn off peripheral compounds (numbers, acros, perhaps names)
** Increase editing distance by one for suggestions? Only possible with limited compounding

!!!Documentation

__TODO:__

* start to reorganise the documentation (__Børre, Sjur, Trond__)

!!!Corpus gathering

__TODO:__

* follow up on {{sma}} corpus texts (__Sjur, Jovsset__)
* follow-up on the {{smj}} texts from __Kurt Tore__ (__Per-Eric__)
* other contacts: ''Nord-Salten avis'' (__Børge Strandskog__), Lena Davidsson, daughter of Lars-Matto Tuolja
* __Ulf Stefan Winka__ has a lot of {{smj}} texts (__Thomas__)
* plan a {{smj}} pr tour for our tools (__Per-Eric, Thomas__)
* meet with the Sámi publishers; main topics: (__Jovsset__)
** present the current project (Divvun II)
** discuss inclusion of our contract in theirs (DG and Čálliid Lágádus already positive)
** present the electronic dictionary sme-nob, with "inflectional intelligence"
** also regular contracts, and how to make faster progress with them, especially for the South Sámi and Julev Sámi publishers, since we have few texts in these two varieties.
** prepare meeting with __Børre__ (__Jovsset__)
* make leaflet to inform about the project
** write text (__Sjur__)
** translate text (__Thomas, Jovsset, Maaren__)
** talk to __John Marcus Kuhmunen__ about layout, pictures (__Jovsset__)
** talk to __John Marcus Kuhmunen__ about printing (__Jovsset__)
* distribute CD version through the library bus, the language centres and common Sámi centres in all of Sápmi, for example Gaaltije in Östersund. (__Leif Åge, Jovsset, Sjur__)

!!!Future plans, directions and ideas

See a separate document in {{plan/strat/5year.jspwiki}}.
!!!Infrastructure

To accommodate future enhancements in different directions (in rough order of importance):

# test bench for all parts of our language technology efforts
# migrate to svn
# merge gt, kt and st into one, probably after the svn move
# more modularised make / build infra (prepare for smn, sms, sjd, others)
# close certain parts of the code repository (requires svn)
# set up the Leopard Server features for collaborative support:
## permanent chat rooms
## stored (and indexed) chat transcripts of the chat rooms
## iCal server / group calendars
## wiki
# wiki? (is part of Leopard Server) or other web-based documentation
# improve Forrest stability and i18n support
# reorganise the documentation content:
## differentiate between target groups
## get better grouping
## decide what to write in forrest and what in wiki (cf. Apertium for a similar split)
## update/add missing parts
# migrate lexicons to XML, splitting the task
## Name lexica (the Name project)
## Dictionaries (already in XML, task is to integrate them)
## Open POSes (Komi as a test case)
# change the look of the documentation web
# sfst? Both as replacement for xfst and for hunspell/open-source proofing tools
# investigate the NSIS installer, potentially replacing the InstallShield package from Polderland
# corpus content moved to Max Planck repositories?

SVN issues:

* root/prooftools not available - fixed (somehow)
* http access not yet available
* read access to the whole repo is working, BUT:
** gt/smX/polderland should be protected
* everything will be google-able by default if the repo URL is posted
* plan should be protected?
__TODO:__

* make a test-all target that runs all tests we have (__Børre, Sjur, Trond__)
* define and document testing routines (__Børre, Sjur, Trond__)
* add Jabber account in iChat for Lene (__Trond__)
* follow-up migration to svn (__Børre, Sjur, Trond__)
** update user documentation as needed (__Børre, Sjur, Trond__)
** rewrite bashrc aliases geared towards cvs, if needed (__Trond__)
** prepare and discuss with external users: Meänkieli group, Greenlandic group, Jack Ruether (__Trond__)
* try to repair G5 accounts for iCal Server (__Børre__)
** update the OS at the same time

!!!Linguistics

!!North Sámi

(nothing new, see proofing bugs below)

!!Lule Sámi

(nothing new, see proofing bugs below)

__TODO:__

* {{sme->smj}} lexicon conversion to build bilingual lexicon resources, and increase {{smj}} coverage (__Trond, Svenne__).
* Add the words when all words are ready.

!!South Sámi

__Jovsset__ will ask the authors whether we can get a copy of the Verbh manuscript in electronic version, with the usual corpus contract.

!!!Name lexicon/risten.no infrastructure

__TODO:__

# fix i18n bug in risten.no/G5 (so they will work without the proper locale request) (__Sjur__)
# fix bugs in lexc2xml; add comments to the log element (__Saara__)
# finish first version of the editing (__Sjur__)
# test editing of the xml files. If ok, then: (__Sjur, Thomas, Trond__)
# make terms-smX.xml <=== automatically from propernoun-sme-lex.xml (add nob as well) (the morphological section should be kept intact, in e.g. propernoun-sme-morph.txt) (__Sjur, Saara__)
# convert propernoun-($lang)-lex.txt to a derived file from common xml files (__Sjur, Tomi, Saara__)
# implement data synchronisation between [risten.no|http://www.risten.no] and the cvs repo, and possibly other servers (i.e. the G5 as an alternative server to the public risten.no - it might be faster and better suited than the official one; also local installations could be treated the same way)
# start to use the xml file as source file
# clean terms-sme.xml such that all names have the correct tag for their use (e.g. @type=secondary) (__Thomas, Maaren, linguists__)
# merge placenames which are erroneously in different entries: e.g. Helsinki, Helsingfors, Helsset (__linguists__)
# publish the name lexicon on risten.no (__Sjur__)
# add missing parallel names for placenames (__linguists__)
# add informative links between first names like Niillas and Nils (__linguists__)

!!Dictionaries

__TODO:__

* clean up and generalise the make infrastructure
* make Linux and Windows local/integrated versions
* make simple installer applications
* make a public release
** Make a homepage with instructions for dictionary use: {{xtdoc/gtuit/src/documentation/content/xdocs/dict.eng.xml}}
** Clarify the difference between local and online dictionaries:
*** Plugin for Firefox and Internet Explorer (online dictionaries)

!!!Proofing tools

!!Hunspell

__TODO:__

# change license on distros to GPL2+ (__Børre__)
# QA README and installation docs (__Trond__)
# fix the remaining conversion bugs (__Børre, Tomi__)

!!Testing

!Spelling Error Markup

__TODO:__

* Set up ways of adding meta-information (source info, used in testing or not, added to lexicon or not) (__Saara__)
* test new and nested error markup (__Sjur__)

!!Speller bugs

List of bugs returned from Polderland: 621, 630, 652, 656, 676.
Open issues based on test results:

!sme

Version: __Davvisámi, version 1.0.1, 2008-06-02__

* 426 - comp words from Divvun.no - ''guoktedássásaš'' accepted - still __OPEN__
* 435 - roman numbers - inflection of single letter numbers rejected, as well as some complex numbers (but is ok in {{smj}}) - still __OPEN__
** we should pregenerate all numbers once and for all, and store them in a separate lexicon file
* 595 - prefix+name without hyphen (''ovdaLot'' instead of ''ovda-Lot'') - still __OPEN__
* 600 - gen+hyph compound ''sámi-dáru'' - still __OPEN__
* 603 - suomabealdi accepted - still __OPEN__
* 606 - speller accepts VUOHTA compound - still __OPEN__
* 611 - double hyphen sugg still accepted - still __OPEN__
* 613 - short gen. as second compound part - still __OPEN__
* 619 - numerals and pronouns to NAMÁK and SASJ fails - still __OPEN__
* 627 - prefix + hyphen does not get accepted - __FIXED__
* 629 - ''a'' taking part in compounding without hyphen - still __OPEN__
* 633 - __REGRESSION:__ double hyphens accepted
* 634 - PropGen+hyph+PropGen - still __OPEN__
* 641 - numeral+noun compounds - still __OPEN__
* 642 - noun/adj/proper + hyphen + ain - still __OPEN__
* 644 - cased numeral+numeral compound - still __OPEN__
* 646 - adverb + hyphen + noun - still __OPEN__
* 647 - numerals+NOUN - still __OPEN__
* 648 - unmotivated suggestions with numeral+noun - still __OPEN__
* 649 - name + adj compound without hyphen - still __OPEN__
* 654 - speller does not recognize ordinals on -nuppelogát - still __OPEN__
* 655 - pron + nai - still __OPEN__
* 658 - Suggestion saame - still __OPEN__ - won't fix
* 666 - guovtte- and njealje- - __FIXED__
* 676 - triple-hyphen - __FIXED__, but double hyphen is still accepted
* other __regressions:__
** ''skuvlajagin'' now accepted
** ''skierranis'' now accepted

!smj

Version: __Julevsáme, version 1.0.1, 2008-06-02__

* 435 - roman number - single letter numbers now recognised
** we should pregenerate all numbers once and for all, and store them in a separate lexicon file
** please note that ''inflection'' of single letter numerals is __fine__ in {{smj}}, as opposed to {{sme}}
* 595 - prefix+name without hyphen (''tsåhkeLot'' instead of ''tsåhke-Lot'') - still __OPEN__
* 599 - __REGRESSION:__ numeral attr:s on lot
* 600 - gen+hyph compound ''sáme-dáro'' - still __OPEN__
* 616 - Bispadime-me-ráden - still __OPEN__, try to find an acro or abbr ''me''
* 619 - numerals and pronouns to NAMÁK and SASJ fails - still __OPEN__
* 629 - ''a'' taking part in compound - still __OPEN__
* 634 - Prop gen + hyphen + Prop gen - still __OPEN__
* 641 - numeral+noun compounds - still __OPEN__
* 644 - cased numeral+numeral compound - still __OPEN__
* 647 - numerals+NOUN - still __OPEN__
* 648 - unmotivated suggestions with numeral+noun - still __OPEN__
* 649 - name + adj compound without hyphen - still __OPEN__
* 650 - noun prefix+name compound without hyphen - still __OPEN__
* 658 - Suggestion saame - still __OPEN__, won't fix
* 692 - __NEW:__ numeral-variants
* other __regressions:__
** ''gus'' NOT accepted anymore

__TODO:__

* compile new speller lexicons (__Tomi__)
* document how compounding is controlled in the PLX conversion (__Tomi__)

!!Hyphenator bugs

Open issues based on test results:

!sme

Lexicon version: __Davvisámi, version 1.0.1, 2008-04-01__

* 468 - __REGRESSION:__ ''Márkomeanu''
* 547 - __REGRESSION:__ hyphen in front of vowel: ''Lotnolasealáhusas''
* 548 - __REGRESSION:__ mid syllable hyphenation: ''Háliidivččen''
* 549 - __REGRESSION:__ division without hyph: ''Váccedettiin''
* 673 - adj-derivations: ''guovttenuppelotčoarvvagiin'' (the word is not rec.)
* 677 - __NEW:__ Wrongly hyphenated ending -danidja - invalid

!smj

Lexicon version: __Julevsáme, version 1.0.1, 2008-04-01__

* 545 - __REGRESSION:__ bad hyphenation in compounds: ''åhpadusorganisásjåvnån'' (not recognised)
* 546 - __REGRESSION:__ obligatory hyph rules seem to work in facultative manner: ''organisásjåvnån'' (not recognised)
* 547 - __REGRESSION:__ hyphen in front of vowel: ''Jienastimnjuolgadusá'' and ''Orgánajs''

__TODO:__

* fix hyphenator errors (__Tomi__)

!!InDesign tools

Nothing new.

!!Releases

__TODO:__

* update the ''Changes'' document (__Sjur__)
* InDesign documentation (__Sjur__)
** Norwegian translation received from Davvi Girji

!!!Other

!!Corpus contracts + open source

Now decided to wait until we have changed from {{cvs}} to {{svn}}.

!!Summer vacations

|| Who || When
| Børre | 30/6-6/7, 21/7-3/8, 11/8-17/8
| Jovsset | 7-11/7, 21/7-8/8
| Per-Eric | 11/6 - 30/6
| Sjur | 7/7 - 1/8
| Tomi | 16/6 - 4/8
| Thomas | 23/6 - 18/7
| Trond | 30/6 - 18/7, 28/7 - 1/8

!!!Next meeting, closing

The next meeting is 11.8.2008, 9.30 Norwegian time.

The meeting was closed at 13:18.

!!!Appendix - task lists for the next five days

!!Boerre

[iCal|/doc/admin/weekly/2008/Tasks_2008-06-30_Boerre.ics]

* update svn user documentation as needed
* try to repair G5 accounts for iCal Server
* make a test-all target that runs all tests we have
* define and document testing routines
* fix the remaining hunspell conversion bugs
* send new svn e-mail 23.6 as a reminder
* change license on hunspell distros to GPL2+
* [fix bugs!|http://giellatekno.uit.no/bugzilla]

!!Jovsset

[iCal|/doc/admin/weekly/2008/Tasks_2008-06-30_Jovsset.ics]

* follow up on {{sma}} corpus texts
* translate leaflet text
* Talk to __John Marcus Kuhmunen__ about layout, pictures for the leaflet.
* Talk to __John Marcus Kuhmunen__ about printing
* Presentation at the Sámi publisher meeting
* distribute CD version through the library bus, the language centres and common Sámi centres in all of Sápmi

!!Lene

* get the ped content ready
* Work on test routine with __Trond__ and __Sjur__

!!Per-Eric

[iCal|/doc/admin/weekly/2008/Tasks_2008-06-30_Per-Eric.ics]

* follow-up on the {{smj}} texts from __Kurt Tore__ (__Per-Eric__)
* follow-up contracts from ''Nord-Salten avis'' and __Lena Davidsson__
* Work with missing list same_dutkama_pgr.txt
* Plan a {{smj}} pr tour for our tools
* [fix bugs!|http://giellatekno.uit.no/bugzilla]

!!Saara

* add new XSL/XML headers for proofing test docs
* Set up ways of adding meta-information for proofing correct corpus docs (source info, used in testing or not, added to lexicon or not)
* implement the ped UI and functionality

!!Sjur

[iCal|/doc/admin/weekly/2008/Tasks_2008-06-30_Sjur.ics]

* update svn user documentation as needed
* follow up on {{sma}} corpus texts
* name db/risten.no
* update the ''Changes'' document
* follow-up on some Polderland-related bugs: 621, 630, 652
* InDesign documentation
* make a test-all target that runs all tests we have
* define and document testing routines
* write leaflet text
* [fix bugs!|http://giellatekno.uit.no/bugzilla]

!!Thomas

On vacation

!!Tomi

On vacation

!!Trond

[iCal|/doc/admin/weekly/2008/Tasks_2008-06-30_Trond.ics]

* Set up Jabber for Lene
* make a test-all target that runs all tests we have
* define and document testing routines
* Dictionaries
* update svn user documentation as needed
* prepare external users: Meänkieli and Greenlandic groups, Jack Ruether
* [fix bugs!|http://giellatekno.uit.no/bugzilla]