Preparing data for moses:

1. Ask Trond to align the sme-smj sentences, or use a sentence aligner
   (we assume that the tca2 output is ok!)

   smj:      l100.sent.xml
   sme:      n100.sent.xml
   key-file: manually_corrected_n100.sent_l100.sent.xml

2. Run the stylesheet tca2cg.xsl

   java net.sf.saxon.Transform -it main tca2cg.xsl key_file="manually_corrected_n100.sent_l100.sent.xml" lang_a_file="n100.sent.xml" lang_b_file="l100.sent.xml"

   ==> output in a separate directory

   parallel-output/
       l100.sent.txt
       n100.sent.txt

3. Run the vislcg3 pipeline

   cat n100.sent.txt | preprocess --abbr=/Users/cipriangerstenberger/gtsvn/gt/sme/bin/abbr.txt | osme | lookup2cg | vislcg3 -g /Users/cipriangerstenberger/gtsvn/gt/sme/bin/sme-dis.bin > n100.sent.cg.out

   cat l100.sent.txt | preprocess --abbr=/Users/cipriangerstenberger/gtsvn/gt/smj/bin/abbr.txt | osmj | lookup2cg | vislcg3 -g /Users/cipriangerstenberger/gtsvn/gt/smj/bin/smj-dis.bin > l100.sent.cg.out

4. Transform the vislcg3 output into moses input format
   (at the moment only WORD_FORM|LEMMA|POS)

   ./cg2moses.pl moses-input/l100.sent.cgout > l100.sent.moses.in
   ./cg2moses.pl moses-input/n100.sent.cgout > n100.sent.moses.in

Ready to have fun with moses!
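
Note on step 4: cg2moses.pl itself is not reproduced in these notes. The
following is only a rough sketch of what such a conversion might look like, not
the actual script. It assumes the usual vislcg3 stream layout (a cohort line
"<wordform>" followed by indented reading lines of the form "lemma" TAG TAG ...),
keeps only the first reading of each cohort, takes the first tag as POS, and
ends a sentence after a cohort whose word form is ".", "?" or "!". The function
name cg_to_factored and all three of these choices are assumptions made for
illustration.

```shell
#!/bin/sh
# cg_to_factored: hypothetical helper, NOT the project's cg2moses.pl.
# Reads vislcg3 output on stdin and prints one sentence per line,
# each token as WORD_FORM|LEMMA|POS.
cg_to_factored() {
  awk '
    # Cohort line: "<wordform>"
    /^"<.*>"$/ {
      form = substr($0, 3, length($0) - 4)   # strip the "< and >" wrappers
      have_reading = 0
      next
    }
    # Reading line: leading whitespace, then "lemma" followed by tags.
    /^[ \t]+"/ {
      if (have_reading) next                 # assumption: keep first reading only
      have_reading = 1
      line = $0
      sub(/^[ \t]+"/, "", line)              # drop indentation and opening quote
      q = index(line, "\"")                  # position of the closing lemma quote
      lemma = substr(line, 1, q - 1)
      n = split(substr(line, q + 1), tags, " ")
      pos = (n > 0) ? tags[1] : "UNK"        # assumption: first tag is the POS
      printf "%s%s|%s|%s", sep, form, lemma, pos
      sep = " "
      # Assumption: sentence-final punctuation closes the output line.
      if (form == "." || form == "?" || form == "!") { printf "\n"; sep = "" }
      next
    }
    # Flush an unterminated last sentence.
    END { if (sep != "") printf "\n" }
  '
}
```

With the function defined, it would be used the same way as the script in
step 4, e.g. cg_to_factored < n100.sent.cg.out > n100.sent.moses.in. Word forms
or lemmas that contain "|" or a double quote are not handled and would need
escaping before the result is fed to moses.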