!!!Meeting with Polderland 2.10.2007 Participants: * Peter Beinema * Sjur Moshagen !!!Agenda * Since last time * Possible issues * Next meeting * Todo items for the next meeting !!!Since last time !!Polderland * dropped: new version of MS windows installer, incl. Lule Sami hyphenation * pending, working on new installer (bugs 455 and 421): Mac installer for Office, including Lule Sami Hyphenator * dropped: North Sami hyphenator for Mac InDesign * pending: Lule Sami hyphenator for Mac InDesign: problem with language identification when combined with North Sami hyphenator * fixing bugs: some bugs pending (461, 521, 522) * compounding behaviour: ** 3/4/5-part compound appear to work in general, but not in some specific cases *** investigating costs of modification of PLX flags. See some ideas below. * mac speller: installation issue (admin rights)? * zip utility for windows: not functional yet ** first part works, calling REAL language-dependent installer appears to crash ** have to do some debugging there * Speller behaviour: only suggests if errors are limited to 1 part of compound. ** undesired (or should be user setting), ** will be changed in future drop Ideas for PLX flag modifications: {{{ L + R possibilities are limited L: anything but rightmost R: anything but leftmost considered: B: leftmost only E: rightmost only ? (not M, in any case): anywhere but leftmost or rightmost }}} Development costs are probably prohibitive, though, first estimate is over 40 hrs of work. It would not be a ~Sami-specific change, but was not planned. ==> will investigate further, get detailed estimate !!Divvun * testing, bug hunting and fixing during the whole week in Tromsø, major linguistic bug fixes. The quality is approaching release quality for North Sámi, except for N-part compounds. Lule Sámi ok, but not as good as North, mainly due to lack of corpus texts * major remaining bugs relate to limits in the PLX formalism: ** adj+noun compounding do not follow the same rules as noun+noun or adj+adj (which means we overgenerate in some cases, and undergenerate in others) *** will be further investigated by the Divvun team ** N-part compounds either overgenerate in a bad way, or severely undergenerate !!Bug status !Solved/cancelled ||Div ||Pld ||Description | 402 | 439 | word + hyphen + comma is not recognised ("xxx-,") => solved | 419 | 440 | 3-part compounds not recognised => solved => solved | 448 | 444 | Upper-case words get suggestions with Initial Case => solved | 449 | 445 | suopmasápmelaš-type compounds accepted by the speller => limits of tagging /compounding mechanism reached !To be solved in next drop ||Div ||Pld ||Description | 455 | 446 | Mac uninstaller doesn't work if run by a non-admin => solved | 473 | 452 | Windows installer does not autostart after download => discuss solutions | 516 | 4?? | vista installation from zip fails => probably solved by self-extracting zip; not functioning yet !Under investigation by Polderland ||Div ||Pld ||Description | 461 | 448 | Spelling errors with editing distance 1 from lexicalised words do not get correct suggestions => explanation not good enough, investigate more samples *** still investigating | 480 | 455 | Speller suggestions not identical in context menu and dialog => MS word behaviour: sorting of same-penalty suggestions; => no plans to repair this one | 521 | xxx | Mac installer only works for admin accounts => | 522 | xxx | Strange compounding fenomena !To be investigated by Divvun/Users ||Div ||Pld ||Description | 447 | 443 | Windows speller doesn't install on terminal server => SD IT guys will have a look once more !!!Issue discussion !!473 - Windows installer Alternatives to solve it: * make an executable installer file - requires InstallShield, meaning that either Polderland will have to make each installation package, or that the Divvun project will have to get an InstallShield license * make self-extracting zip files for download, with autostart options of the extracted objects; some alternative zippers for this: For both cases different types of protection software (firewalls, anti-virus, etc) might block either download or execution of the installer. But default InstallShield installers behave in the same (or a similar) way, so that should not be any different. !!Speed With more than 2-part compounds now available, suggestion speed is sometimes very slow, in the range of 10 seconds. !!Casing of suggestions Example of acro words: {{{ INTERREG Inflected: INTERREG:s (ie lower-case case endings in normal usage) user types: INTERREG:ii Suggestion: INTERREG:I Expected: INTERREG:i (= lexicon entry) lexicon has: INTERREG:i }}} !!!Rough schedule Polderland didn't meet their self-imposed goal of delivering at the beginning of September, will try around mid September instead. * Mid September: planned final drop from Polderland * November 1: Divvun code freeze * November 15: Indesign speller * December 11: public release !!!Next meeting Next week (9.10.) at the usual time. !!!TODO: * __PLD__ drop first version of indesign Hyphenator - done * __PLD__ pass information on self-extracting ziptool with autostart option + instructions on how to use it (not in working order yet) - pending * __PLD__ Proposal for command line hyphenator: 12/9 ready to be sent by commercial dept - sent * __PLD__ do speed tests: Mac/PPC Mac/c2duo Win, Office 11 Office 12 (compounding especially) - interpreter issue (Rosetta): much faster on much slower systems when not interpreted, factor of 20+ Office 2008 has native intel code, will be much faster PLD code is universal binary, but depends for speed on calling method * __PLD__ provide information on InDesign language groouping * __Divvun__ investigate further the A+N compounds * __Divvun__ add Acro case ending casing to Bugzilla (INTERREG:i)