!!!Grammar checker meeting Nov. 11, 2013, Fran & Sjur

* Sjur to test that the code in gramchk compiles

Pending things:

* Stabilise the layout of the data. What we have works; do we need any more files?
** a bit early, we need to experiment more with the actual features we need
** at least we need a normative generator for the suggestions, possibly with dialect tags
* Should we use a zip file, or just a flat directory layout? What should the file names be?
** zip file for ease of distribution
** directly read the zip file as for the speller
** suffix: .zgram
** use an index file as for the speller
* Try to develop the error.xml file a bit more:
** Add translations
** Check if it is sufficient
* Tokenisation
** move the sentence splitting code from voikko core to language code
** do sentence splitting in CG, for language-dependent capitalisation errors and sentence boundary definitions
** also word-level tokenisation would need to be moved to the language code, preferably by the lookup tool
*** either: extract list of punctuation from fst - Fran if lexc doesn't work out
*** or: encode optional punctuation in LexC - Sjur
* Code note: sentence/Sentence.cpp
* Suggestions. Some existing tools and resources:
* [http://extensions.libreoffice.org/extension-center/lightproof-editor]
* [http://libreoffice.hu/2011/12/08/grammar-checking-in-libreoffice/]
* [http://www.techrepublic.com/blog/five-apps/five-libreoffice-extensions-to-help-you-catch-grammar-problems/]
* On/off interface for individual error checks
** talk to Harri about this -- probably wants to do the same for Finnish
** grouping errors into code - type
** perhaps use the settings dialog box for the extension to manage this

Integration plan:

* LibreOffice (OpenOffice?)
* MacOSX grammar/speller API
* MS Office - probably through a VB or .Net wrapper (true Office integration is blocked)

Try to lobby MS to allow grammar checkers etc. for Greenlandic, Sámi (and others?)
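
Sketch for the packaging item under "Pending things" (zip file, read directly, with an index file): the Python below is only an illustration of the idea, not the gramchk or voikko code. The index file name {{index.xml}}, its {{<file role="...">}} layout, and the member name {{grammar.cg3}} are assumptions made up for this example; only {{error.xml}} and the {{.zgram}} suffix come from the notes above.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed index file name; the notes only say "use an index file as for the speller".
INDEX_NAME = "index.xml"


def build_example_archive() -> bytes:
    """Create a tiny in-memory archive with a hypothetical .zgram layout."""
    index = (
        "<index>"
        '<file role="errors">error.xml</file>'
        '<file role="rules">grammar.cg3</file>'
        "</index>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(INDEX_NAME, index)
        zf.writestr("error.xml", "<errors/>")
        zf.writestr("grammar.cg3", "# placeholder rules")
    return buf.getvalue()


def load_archive(source) -> dict:
    """Read the index, then return {role: raw bytes} without unpacking to disk."""
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read(INDEX_NAME))
        return {el.get("role"): zf.read(el.text) for el in root.findall("file")}


# 'source' may also be a path such as "se.zgram" (hypothetical file name).
data = load_archive(io.BytesIO(build_example_archive()))
```

Reading members through the index keeps the file names inside the archive free to change, since callers only look things up by role.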