!!!Speller test result web app - first specification

Goals:

* search test results for error types
* compare test results between versions (in a broad sense, details below)
* compare languages
* graphical and textual displays of test results and comparisons
* graphical and textual display of error type search/query

Front page:

* list of all languages (dynamically generated based on available data)
* last option: all/some languages compared

For each language:

* list of all available speller engines (dynamically generated based on available data)
* last option: compare all spellers

For each speller:

* two sets of main test results:
** precision/recall/accuracy
** suggestion quality
* list of all error types (dynamically generated)
** each type is a link to a textual and graphical presentation of the test results for that error (suggestions only, or false negatives (= undetected errors))
* query form:
** select an error type from a dynamically generated list

Version comparison:

There are several possible variables:

* different numbered versions (e.g. 1.1 vs 1.2)
* different dates of development versions (e.g. 1.3dev 2014-10-01 vs 2014-10-10)
* different languages for the same speller engine
* different engines for the same language
* all engines for all languages (the mega-big comparison :) ) (?)

Graph types:

* non-comparing graphs:
** precision & recall: vertical bar chart (one bar each for precision, recall and accuracy, 0-100%)
** suggestion quality overview: stacked bar, colours as in the present graphs
** suggestion quality details: multiple stacked bars, one for each error type
* comparing graphs:
** precision & recall: as above, but in groups of three, one group for each version
** suggestion quality overview: one stacked bar for each version
** suggestion quality details: no comparison for now (we wait with this)

Graphing is done using [http://dc-js.github.io/dc.js/]. Textual data is produced directly by eXist/XQuery as xhtml.

Textual data:

* summary tables (cf. the present speller test result pages [here|https://giellalt.uit.no/proof/spelling/testing/sme/pl/goldstandard/latest-GoldstandardTexts.txt.summary.html]):
** precision & recall (cf. the formula sketch at the end of this page)
** table of true & false positives/negatives
** suggestion quality
* detail tables (?):
** actual suggestions for spelling errors (?)

We might want to wait with the detail tables, for at least two reasons:

* they produce large tables (heavy on the server, the network and the browser)
* they would reveal the gold standard data, which could easily be misused to cheat (i.e. to improve the spellers against the gold standard itself), and thus destroy our gold standards

Test types in the old system:

* gold standard
* regression
* typos
* word types (morphological constructions)

In the new test bench, only gold standard testing is included, whereas the other test types will be reworked as shell scripts and included in {{make check}} for the spellers. This needs more consideration, but is outside this project.

Data organisation in eXist:

{{{
$APPROOT/data/$GTLANG/$SPELLERENGINE/$TESTTYPE/$testdate-$spellerversion.xml
}}}

We should add {{$GTLANG}} and {{$SPELLERENGINE}} as metadata in the xml header as well. We keep {{$TESTTYPE}} as part of the directory structure for future compatibility, in case we decide to cover other test types than gold standard testing.

Development steps ahead:

* get the webapp wiring in place, producing textual presentations only:
** different pages, links etc.
** search possibilities
* add graphs (see the dc.js sketch below)

Børre and Sjur will work on the xml and the actual test bench.
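To make the "add graphs" step concrete, here is a minimal, untested dc.js sketch for the simplest non-comparing graph (precision/recall/accuracy as vertical bars on a fixed 0-100% scale). The data shape, the variable names and the element id {{#pra-chart}} are assumptions for illustration only, not part of this specification; dc.js needs d3 and crossfilter loaded alongside it.

{{{
// Sketch only: one non-comparing precision/recall/accuracy bar chart.
// Assumed data shape: one row per measure, values in percent.
var data = [
    { measure: 'precision', value: 91.2 },
    { measure: 'recall',    value: 83.7 },
    { measure: 'accuracy',  value: 95.4 }
];

var ndx        = crossfilter(data);
var measureDim = ndx.dimension(function (d) { return d.measure; });
var valueGroup = measureDim.group().reduceSum(function (d) { return d.value; });

dc.barChart('#pra-chart')              // '#pra-chart' is an assumed element id
    .width(400)
    .height(300)
    .dimension(measureDim)
    .group(valueGroup)
    .x(d3.scale.ordinal())             // d3 v3-style scale: one bar per measure
    .xUnits(dc.units.ordinal)
    .y(d3.scale.linear().domain([0, 100]))  // fixed 0-100% scale, as specified above
    .yAxisLabel('%');

dc.renderAll();
}}}

The comparing variants would reuse the same wiring, with one group per version; we can settle on the exact dc.js chart types when we get to that step.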
Until we have updated test results, use the small xml document from Monday as test data during webapp development. Webapp svn location: {{$MAIN/apps/divvuniskan/}}. ''Divvuniskan'' is also the name of the webapp.
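For reference while wiring up the summary tables: a small sketch of how the headline figures are derived from the true & false positive/negative counts. The formulas are the standard ones; the JavaScript function, its name and its argument order are our own and serve only as documentation here.

{{{
// Sketch only: headline figures from the confusion-matrix counts.
//   tp = misspelled words flagged by the speller (detected errors)
//   fp = correct words flagged by the speller
//   tn = correct words accepted
//   fn = misspelled words not flagged (undetected, the "false negatives" above)
function spellerMetrics(tp, fp, tn, fn) {
    return {
        precision: tp / (tp + fp),                 // share of flagged words that are real errors
        recall:    tp / (tp + fn),                 // share of real errors that were flagged
        accuracy:  (tp + tn) / (tp + fp + tn + fn) // share of all words handled correctly
    };
}
}}}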