Test ideas We want to test our transducers betters !!!Existing tests # Paradgim testing against predefined answers: yaml tests # Tests written in the lexc and twolc code # Testing whether we generate the lemma or not # Tests using the lemma list as gold standard (do we generate the lemma) !!!Ideas for new tests * Test for Multichar Symbols on the lower side * Test for phonotactically illegal strings !!!Elaborating the test ideas !!Test for Multichar Symbols on the lower side Now and then Multichar Symbols slip through twolc and give "words" like \\ {{Suome^Vn}} pro correct {{Suomeen}}. How to test for this: * Read the set of multichar symbols from root.lexc * Make a transducer {{multichar.fst}} of them * Compose {{LANG.fst .o. multichar.fst}} in xfst * list the result (should be empty) This test one should be able to set up language-independently. In case we get !!Test for phonotactically illegal strings Example, from fkv (this must be adjusted to a script): * Make a regex accepting strings in Vowel + Vowel + e: {{{ regex [ ?* [a|e|i|o|u|ä|ö] [a|e|i|o|u|ä|ö] e ] ;}}} * Compose it with the main fst: {{{ xfst -e "regex @\"src/analyser-gt-desc.xfst\" .o. @\"VVe.fst\" ; "}}} * print the result with xfst {{{ xfst[1]: print lower-words > lower-words.txt}}}