!!!Speller test result web app - first specification

Goals:

* search test results for error types
* compare test results between versions (in a broad sense, details below)
* compare languages
* graphical and textual displays of test results and comparisons
* graphical and textual display of error type search/query

Front page:

* list of all languages (dynamically generated based on available data)
* last option: all/some languages compared

For each language:

* list of all available speller engines (dynamically generated based on available data)
* last option: compare all spellers

For each speller:

* two sets of main test results:
** precision/recall/accuracy
** suggestion quality
* list of all error types (dynamically generated)
** each type is a link to a textual and graphical presentation of the test results for that error (suggestions only, or false negatives (= undetected errors))
* query form:
** select an error type from a dynamically generated list

Version comparison:

There are several possible variables:

* different numbered versions (e.g. 1.1 vs 1.2)
* different dates of development versions (e.g. 1.3dev 2014-10-01 vs 2014-10-10)
* different languages for the same speller engine
* different engines for the same language
* all engines for all languages (the mega-big comparison :) ) (?)

Graph types:

* non-comparing graphs:
** precision & recall: vertical bar chart (one bar each for precision, recall and accuracy, 0-100%)
** suggestion quality overview: stacked bar, colours as in the present graphs
** suggestion quality details: multiple stacked bars, one for each error type
* comparing graphs:
** precision & recall: as above, but in groups of three, one group for each version
** suggestion quality overview: one stacked bar for each version
** suggestion quality details: no comparison for now (we wait with this)

Graphing is done using [http://dc-js.github.io/dc.js/]. Textual data is produced directly by eXist/XQuery as xhtml.

Textual data:

* summary tables (cf. the present speller test result pages [here|https://giellalt.uit.no/proof/spelling/testing/sme/pl/goldstandard/latest-GoldstandardTexts.txt.summary.html]):
** precision & recall (cf. the formula sketch at the end of this page)
** table of true & false positives/negatives
** suggestion quality
* detail tables (?):
** actual suggestions for spelling errors (?)

We might want to wait with the detail tables, for at least two reasons:

* they produce large tables (heavy on the server, the network and the browser)
* they would reveal the gold standard data, which could easily be misused to cheat (i.e. to improve the spellers against the gold standard itself), and thus destroy our gold standards

Test types in the old system:

* gold standard
* regression
* typos
* word types (morphological constructions)

In the new test bench, only gold standard testing is included, whereas the other test types will be reworked as shell scripts and included in {{make check}} for the spellers. This needs more consideration, but is outside this project.

Data organisation in eXist:

{{{
$APPROOT/data/$GTLANG/$SPELLERENGINE/$TESTTYPE/$testdate-$spellerversion.xml
}}}

We should add {{$GTLANG}} and {{$SPELLERENGINE}} as metadata in the xml header as well. We keep {{$TESTTYPE}} as part of the directory structure for future compatibility, in case we decide to cover other test types than gold standard testing.

Development steps ahead:

* get the webapp wiring in place, producing textual presentations only:
** different pages, links etc.
** search possibilities
* add graphs (see the dc.js sketch below)

Børre and Sjur will work on the xml and the actual test bench.
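To make the "add graphs" step concrete, here is a minimal, untested dc.js sketch for the simplest non-comparing graph (precision/recall/accuracy as vertical bars on a fixed 0-100% scale). The data shape, the variable names and the element id {{#pra-chart}} are assumptions for illustration only, not part of this specification; dc.js needs d3 and crossfilter loaded alongside it.

{{{
// Sketch only: one non-comparing precision/recall/accuracy bar chart.
// Assumed data shape: one row per measure, values in percent.
var data = [
    { measure: 'precision', value: 91.2 },
    { measure: 'recall',    value: 83.7 },
    { measure: 'accuracy',  value: 95.4 }
];

var ndx        = crossfilter(data);
var measureDim = ndx.dimension(function (d) { return d.measure; });
var valueGroup = measureDim.group().reduceSum(function (d) { return d.value; });

dc.barChart('#pra-chart')              // '#pra-chart' is an assumed element id
    .width(400)
    .height(300)
    .dimension(measureDim)
    .group(valueGroup)
    .x(d3.scale.ordinal())             // d3 v3-style scale: one bar per measure
    .xUnits(dc.units.ordinal)
    .y(d3.scale.linear().domain([0, 100]))  // fixed 0-100% scale, as specified above
    .yAxisLabel('%');

dc.renderAll();
}}}

The comparing variants would reuse the same wiring, with one group per version; we can settle on the exact dc.js chart types when we get to that step.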
Until we have updated test results, use the small xml document from Monday as test data during webapp development. Webapp svn location: {{$MAIN/apps/divvuniskan/}}. ''Divvuniskan'' is also the name of the webapp.
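For reference while wiring up the summary tables: a small sketch of how the headline figures are derived from the true & false positive/negative counts. The formulas are the standard ones; the JavaScript function, its name and its argument order are our own and serve only as documentation here.

{{{
// Sketch only: headline figures from the confusion-matrix counts.
//   tp = misspelled words flagged by the speller (detected errors)
//   fp = correct words flagged by the speller
//   tn = correct words accepted
//   fn = misspelled words not flagged (undetected, the "false negatives" above)
function spellerMetrics(tp, fp, tn, fn) {
    return {
        precision: tp / (tp + fp),                 // share of flagged words that are real errors
        recall:    tp / (tp + fn),                 // share of real errors that were flagged
        accuracy:  (tp + tn) / (tp + fp + tn + fn) // share of all words handled correctly
    };
}
}}}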