$Rev: 158353 $
Added the missing files for a working grammar checker. Fixed grammar checker
build rules to not be dependent upon enabling tokenisers.

r158289:
Added conversion of the analysis tags from the grammar checker speller into CG
format.

r158250:
One misplaced variable caused the grammar checker speller to be built
independent of the configuration. This caused a build failure for everyone. Solves
bug #2437. Also added $(srcdir) in front of root.lexc, to ensure that the file
reference resolves correctly in local build targets.

r158242:
Moved the target clean-local to the local Makefile, to make it possible to
enhance the clean target with locally generated files.

r157960:
Corrections to the grammar checker speller build: we now build a working zhfst
file that can be used as part of the development cycle. Also additions to silent
builds.

r157879:
Major update to the grammar checker template. It still does not work completely
as it should, so hold your horses. Update content: ensured that all files needed
are copied to the grammar checker build dir, removed option to name files
(=irrelevant bloat), now builds an almost proper zip file, and ensured that
tokenisers are built before grammarcheckers. Also made it so that when grammar
checkers are enabled, spellers are automatically enabled too, as they will be
included as part of the grammar checker pipeline.

r157261:
Changed the file exists test for the lemma generation testing so that it will
work even in cases where multiple source files are used as input.

r157204:
Made cg3 file compilation more general.

r157096:
Moved the code to build the apertium relabel script in the apertium directory,
so that we can use the actual giella-tagged fst for MT as the tag source. This
should fix all issues of missing tags in the relabel script.

r157021:
GLE requires the ability to compile regexes in src/, and there is no reason not
to support it.

r156971:
Fixed a shortcoming in the build infra uncovered by gle: no explicit support for
language-specific build rules that will not end up in lexicon.?fst.

r156319:
Moved tag extraction to a separate am-include file, so that it can be shared
between different dirs. Moved generation of regex for turning tags into CG
friendly format from src/filters/ to tools/tokenisers/filters/.

r156233:
After a couple of bug fixes in giella-core, require the new version.

r156188:
Initial support for building tokenisers where the morphological analysis tags
are given in CG format directly instead of having to be postprocessed by
hfst-tokenise before being printed. The idea is to make the hfst-tokenise code
more general, and move everything that is particular to one language or setup
into the fst instead of being hardcoded in the C++ code. There are some
issues that must be resolved, but fst-wise the code works.

r156180:
Added support for building a regex that transforms all tags from the format
"+Adv" to " Adv" (including space). The idea is to make the tags readily
consumable by CG. Both prefix and suffix tags are converted. Newest giella-core
required.
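
A minimal sketch of the tag rewrite described above, written in Python rather
than as the fst regex the build actually generates. It only covers
suffix-style tags ("+Adv"); prefix tags are left out. The function name, the
pattern for tag names, and the sample string are assumptions made for
illustration.

```python
import re

# A suffix-style tag: a plus sign followed by the tag name.
# The character class used for tag names is an assumption of this sketch.
SUFFIX_TAG = re.compile(r"\+([^+\s]+)")


def tags_to_cg(analysis: str) -> str:
    """Rewrite every "+Tag" in an analysis string as " Tag" (with the space)."""
    return SUFFIX_TAG.sub(r" \1", analysis)


# Hypothetical analysis string, for illustration only.
print(tags_to_cg("lemma+N+Sg+Nom"))
```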

r156162:
Part two of renaming the preprocess dir to tokenisers. Now all refs to it are
updated.

r156153:
Renamed the preprocess dir to tokenisers, to better describe the content of it.

r155820:
Added support for diffing and merging on Linux. As part of that added checking
for diff tools in m4/giella-macros.m4, and added more tests against failures.
Also added test for cg-mwesplit, and increased the required vislcg3 version to
the 1.0 release.

r155779:
More robust test for the existence of the various vislcg3 files.

r155748:
Added more robust option checking, and a test for the existence of the specified
corpus file. Also added some comments.

r155732:
Actually open the other diff views. And force-add to svn - we don't want error
messages in this context.

r155718:
Corrected glaring variable copy&paste bug. Thanks to Trond for spotting it!

r154835:
Removed from the default build rules the automatic removal of +Comp tags in
adverbs. That is definitely not a behavior we want universally.

r154751:
Fixed a bug that caused the check_analysis_regressions.sh script to fail if you
hadn't put giella-core/scripts/ in your path - which is not automatically done
when you just check out giella-core and your language of interest.

r154655:
Changed command to extract the specified fst name, the old version was not
reliable.

r153095:
Due to a wrong AM conditional, it still built a few mobile speller fst's. Now it
should be quiet.

r153089:
Really do disable mobile spellers by default...

r153083:
Made mobile spellers not build by default, even when enabling spellers. The
mobile spellers must now be explicitly enabled.

r152757:
Removed Ins() around Unknown. This triggered a bug(?) in hfst-tokenise, that
caused wordforms not to be output. Speed and memory consumption should not be
noticeably affected.

r152167:
Improved pmatch scripts - unification by reference instead of full fst
unification. Reduces file size by ≈2/3, and runtime memory consumption by 50%.

r151497:
Now that there is a new version of Hfst out, require it. Should resolve issues
with compiling the url.lexc file.

r150101:
Further development of the analysis regression check: added support for diff
views of all diff types, and now you can specify which diff view you want to see
(and you must specify at least one). You can also override the default corpus,
and specify a corpus of your own with the -c/--corpus option. Also corrected the
initial description of the script in the help text, and added a diff view
comparing the old pipeline using Xerox with the new pipeline using
hfst-tokenise. This will help in finding unwanted differences between the two.

r150035:
Further improvements to the analysis regression check: only do function and
dependency analysis if the required cg3 files exist. Also clarified the -d
option and silenced the Xerox lookup tool.

r150021:
Improved analysis regression check script: added a short help text, and added an
option to ask for a diff between old-style (preprocess+lookup+lookup2cg) and
new-style (hfst-tokenise+mwe-disamb+cg-mwesplit) morphological analysis.
Intended to be used to find weak (and strong!) spots in the new-style
morphological analysis.

r150008:
Added the first version of a $LANG/devtools/ script that will process a corpus
with the available tools, and compare the result against the previous version in
the svn repository. The idea is to be able to easily spot regressions in
analyses due to changes in the lexicons or CG rules. There are a number of rough
edges, but it works.

r149897:
Only remove generated lemma files if the lemma generation tests succeed.

r149609:
Only delete generated dic and tex files if one really wants to start anew. Do
not delete the version.txt file, only the generated wordlist file.

r149598:
Add the url parser also to the grammar checker tokeniser.

r149543:
Make the url.hfst a dependent of the hfst tokenising analyser. Improved the
tokeniser based on recent changes in sme.

r149455:
Removed automatic inclusion of the url parsing fst. The union with the regular
fst blew up the total, in some cases more than 10x! The preferred way of adding
it is to add it in the last steps of the *.tmp.fst > *.fst processing by loading
it onto the stack (and invert it for hfst) before saving the fst stack, and
thus creating a transducer file with two fst's. Applying the input to them both
will in effect union them, giving the output we want without blowing up the size
of the fst file.

r149385:
Added support for compiling a lexc file for parsing URL's as such, giving them a
separate tag. Only added to the descriptive analysers for now. Requires an
updated version of giella-shared, due to the new file needed for the new
functionality.

r149344:
Corrects an inconsistency in the order of tag changing processing, where
generators and analysers got their tags changed in different order, which caused
different tags in some cases. Fixes bug #2264. Thanks to Heiki-Jaan Kaalep for
the new and corrected code.

r149190:
Updated Python feedback to correctly state that Python 3.5 is required.

r148907:
Fixed issue with link generation thanks to Heiki-Jaan Kaalep.

r148504:
Increased required version of Python3, due to the updated speller test bench.

r148389:
New version of the speller test bench, now with sortable table columns, and
optional timing of the suggestions for every input word (hfst-ospell-office
only). Not finished, but working quite well. It is also possible now to specify
the number of suggestions returned by hfst-ospell-office.

r147813:
Increased required version of giella-core due to bug fix in the core.

r147789:
Increased required version of giella-core due to changes in speller building.

r147702:
One more attempt at fixing the giella-common package bug.

r147651:
Added final step in building pattern-based hyphenators: now also prepared for
Hunspell-like OOo hyphenation. Requires new version of the giella-core. Also
corrected bug in checking the version number of giella-common.

r147603:
Tex pattern based hyphenation generation works. The output must be checked and
tested, and the process may have to be rerun several times to get the desired
hyphenation behavior. Removed commented-out build code from the old infra - the
new build code is essentially just a reformulation of the old one.

r147592:
Added support for checking the version of the giella-common package (aka
giella-shared/). Added two new regexes to the source file list for shared
regexes. Updated the required version of Hfst - it has not been updated in ages.

r147576:
Further work on the pattern based hyphenators: added tra file template, which is
used to 'translate'  non-ASCII chars to ascii only for the pattern creation
process. Initial build steps for the pattern build.

r147564:
Improved the fst-based hyphenator by removing irrelevant paths from the fst.
Started work on the pattern-based hyphenator, based on code from the old infra.

r147524:
Finished first version of fst-based hyphenator: now includes plain rules as a
fall-back solution (including for misspelled words), and Err-tagged forms get a
high weight penalty. In general, this seems to give good hyphenation patterns if
one picks the first (lowest-weight) one.
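
The selection step can be sketched as follows, assuming the hyphenator returns
a list of candidate patterns with weights. The candidate strings, the weight
values and all names below are hypothetical; the real output comes from the
fst.

```python
from typing import NamedTuple, Sequence


class Candidate(NamedTuple):
    pattern: str   # hyphenated form
    weight: float  # lower is better


def best_hyphenation(candidates: Sequence[Candidate]) -> Candidate:
    """Return the lowest-weight candidate; ties resolve to the first one listed."""
    return min(candidates, key=lambda c: c.weight)


# Hypothetical candidates and weights, for illustration only.
candidates = [
    Candidate("ex-am-ple", 0.0),    # lexicon-based analysis
    Candidate("exa-mple", 1000.0),  # Err-tagged path with a high penalty
    Candidate("e-xam-ple", 50.0),   # plain-rule fallback
]
print(best_hyphenation(candidates).pattern)
```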

r147517:
First version of lexicon-based and fst-based hyphenation done. Works, but misses
capitalised words, and does not give extra weights to Err-tagged word forms.
Also no hyphenation of misspelled words yet. Hyphenation builds are off by
default.

r147509:
Added template file for weighting tags when the fst is used as a hyphenator.

r147495:
Added check for cg-relabel when enabling apertium. Thanks to Flammie for
identifying the issue.

r147393:
Added basic dir structure for building hyphenators.

r147218:
Replaced gtcore with giella-core.

r147022:
Added test dir for hyphenators, to store data from the old infra.

r147006:
Added test dirs for listbased spellcheckers, if we ever get to that.

r146815:
Fixed logical error in the handling of negated fst specifications in yaml
tests (e.g. ~xfst) - the test didn't work, and the yaml file was run when not
intended.

r146786:
Fixed regression introduced in the previous commit: one-sided tests were
included when looking for test data, causing a subsequent python fail when no
actual test data was found. Fixed by using a stricter file name pattern.

r146741:
Added option to specify in a yaml filename that it should only be tested against
a specific technology or not, by specifying one of .foma, .hfst or .xfst before
the suffix part (before [.gen].yaml), and prefixed with '~' if negated
(i.e. .~xfst for NOT running it against Xerox).
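
The naming convention can be sketched like this. The real check lives in the
test scripts and is not reproduced here; the function name, the sample
filenames, and the rule that an unmarked file runs against every technology
are assumptions.

```python
import re

# Optional ".tech" or ".~tech" marker directly before the "[.gen].yaml" suffix.
MARKER = re.compile(r"\.(~?)(foma|hfst|xfst)(?:\.gen)?\.yaml$")


def should_run(filename: str, tech: str) -> bool:
    """Decide whether a yaml test file applies to the given fst technology."""
    match = MARKER.search(filename)
    if match is None:
        return True  # assumption: no marker means "run against everything"
    negated = match.group(1) == "~"
    marked_tech = match.group(2)
    return (tech != marked_tech) if negated else (tech == marked_tech)


# Hypothetical filenames, for illustration only.
for name in ("nouns.gen.yaml", "nouns.hfst.gen.yaml", "nouns.~xfst.yaml"):
    print(name, [t for t in ("foma", "hfst", "xfst") if should_run(name, t)])
```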

r146706:
Slightly more robust yaml testing code.

r146700:
Common starting point for both weighted and unweighted parts.

r146325:
Added removal of Area tags also for specialised fst's. Fixes Korp issue reported
by Ciprian.

r145082:
Ensure the fastest lookup method is used during hfst yaml generation tests.

r144553:
Removed the bash hack to add a css processing instruction - it is done by the
perl script writing the xml file.

r144287:
Removed the removal for dialect and variant tags from the grammar checker
analyser, the information can be useful when generating suggestions for
corrections.

r143980:
Removed repetition of the frequency weighted fst. The goal was to promote
compounds where each part was already seen in the corpus, but it made the
speller bigger and slower, and actually decreased suggestion quality slightly.
Also added code to do manual priority union, but it is buggy and
commented out for now.

r143822:
Added info about which file to look in to find a suitable frequency corpus
cut-off location (=line number).

r143635:
Renamed the option --enable-hfst-desktop-spellers (added plural 's'), and
changed the behavior of it so that when disabled, zhfst files are still built
(and only those).

r142732:
Cleaner build steps for local speller filters - the regex is now copied in and
compiled according to the fst-format of the speller as opposed to earlier, where
the binary fst was compiled and then transformed.

r142638:
Move CmpNP processing from general speller processing to each language.

r142614:
Also moved the CmpNP filtering to the relevant languages.

r142542:
Forgot one file in the previous commit - now that filter is completely removed
from the core and template, and all language-independent processing.

r142532:
Moved the remove-norm-comp-tags.regex file from the giella-shared directory to
the languages actually using it, and consequently removed it from the
language-independent build files.

r142098:
Updated the speller devtools scripts to obey the new name and location of the
giella-core directory.

r142078:
Added a test that GNU Make is available and is at least version 3.82. Errors
out if it is not found, except on OSX/macOS, where the builtin make is GNU Make
3.81 + patches, which corresponds to the required version or newer.
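
The version rule can be sketched as below. The actual test is part of the
configure machinery and is not shown in this log; the function name and the
way the version string is obtained are assumptions.

```python
def make_is_acceptable(version: str, is_macos: bool) -> bool:
    """Accept GNU Make >= 3.82, or >= 3.81 on macOS (builtin 3.81 + patches)."""
    # Compare only major.minor, e.g. "4.2.1" -> (4, 2).
    parts = tuple(int(p) for p in version.split(".")[:2])
    minimum = (3, 81) if is_macos else (3, 82)
    return parts >= minimum


print(make_is_acceptable("3.81", is_macos=True))
print(make_is_acceptable("3.81", is_macos=False))
```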

r141817:
Better support for speller filters using source files from other locations.

r141652:
Added mwe-dis.cg3, to allow disambiguation of multiword expressions and other
tokenisation ambiguity.

r141536:
We build the tokenising analysers directly off the disamb and grammar checker
analysers in src/, assuming that they are identical. This is a reasonable
assumption now that the hfst tool kit contains all necessary machinery, and we
don't need to pay special attention to the requirements of the tokenisation.

r141525:
Make --with-backend-format work also for the tokenising analysers.

r141189:
Wrong variable name :-( - now it is correct.

r141182:
Corrected makefile dependency for the und.timestamp file.

r141056:
More robustness added to the test scripts: checking several variables, testing
whether the found variables are pointing to existing directories, and giving an
error message if no directory is found.

r140928:
Changed variable name and definition to allow overriding the path to the called
script, to make it easy to use a locally modified script instead.

r140921:
Changed variable name in devtool scripts, to reflect similar changes elsewhere.
Part of fixing bug #2219.

r139846:
Corrected a number of bugs and deficiencies when building spellers when the
giella proofing tools libraries must be fetched over the net. Now the spellers
build correctly under all intended circumstances given that there is a network
connection.

r139830:
Corrected path for the test for availability of the giella-common resources.

r139822:
Added support for getting precompiled proofing tools libraries across the net if
not found locally. Makes it actually possible to build spellers without checking
out the whole of $GIELLA_HOME. Now it is also possible to just check out
$GIELLA_LIBS if one still wants to build everything locally.

r139526:
Applied backend format rules to the tools/mt/ap/filters dir. This is not future
proof, but does not create problems for sme, and solves a bug in smj. The
future problem is that we mix both a specified backend format (for compilation
efficiency) with the default/unspecified format fst (for weighting) in the
same dir, and we can't automatically say which filters need to be in the
specified backend format and which should be in the default format. This
needs further consideration.

r139499:
Completely clean src/transcriptions/, and also clean tools/mt/apertium/filters/.

r139442:
Do not use PKG_CHECK_MODULES if you don't really have to - it clutters your code
and creates unneeded variables = noise.

r139241:
Corrected placeholder string for two-letter ISO language code.

r139232:
Changed the path to the css for the xml speller test results in devtools.

r139138:
Added support for building alternate orthography fst's for dictionary and oahpa,
and also morphers for alternative orthographies. Slight simplification of defs.

r139116:
One small change to support spellers for alternative orthographies built off of
the raw fst instead of the standard fst.

r139107:
Added a possibility to build fst's for alternate orthographies based on the raw
fst surface forms, instead of from the default/standard orthography.

r139057:
Changed all references to $(GIELLA_SHARED)/common into
$(GIELLA_SHARED)/all_langs.

r139045:
Rewrote the code for identifying the location of GIELLA_CORE (former GTCORE).
The code should be more robust, and is prepared to check against a pkg-config
pc file as well. GTCORE is still used throughout the code, but in parallel to
GIELLA_CORE, so that one can easily replace the former with the latter without
causing bugs or other problems.

r139018:
Added checking for and setting of GIELLA_TEMPLATES, but only if you have defined
GIELLA_MAINTAINER (renamed from GTMAINTAINER). Otherwise it is ignored.

r138923:
Revert experiment with priority union - it doesn't work as expected when weights
are involved. Corrected filenames in the .SECONDARY target.

r138908:
Added download links to the build feedback for 'make upload' in
tools/spellcheckers/fstbased/desktop/hfst/.

r138842:
Final step to make the GIELLA_SHARED dir be found in all cases: assign the path
from pkg-config to the variable.

r138835:
Removed the separate test for content, instead adding the test to each possible
location, moving to the next location if no data is found.

r138827:
Changed the search order for GIELLA_SHARED data:
* using --with-giella-shared=/path/to/giella-shared/data/root/dir
* env. variable GIELLA_SHARED
* env. variable GIELLA_HOME
* env. variable GTHOME
* env. variable GTCORE
* using pkg-config
This way it is always possible to override everything else using the --with
option. Added comments.
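
The search order above, combined with the "move to the next location if no
data is found" behaviour from r138835, can be sketched as follows. This is
not the m4 code: the function and parameter names are hypothetical, and the
sketch uses each value as-is, whereas the real configure may derive a
subdirectory from variables such as GIELLA_HOME.

```python
from typing import Callable, Mapping, Optional

# Environment variables in the order listed in this entry.
ENV_ORDER = ("GIELLA_SHARED", "GIELLA_HOME", "GTHOME", "GTCORE")


def find_giella_shared(
    with_option: Optional[str],      # value of --with-giella-shared, if given
    env: Mapping[str, str],          # environment variables
    pkg_config_dir: Optional[str],   # path reported by pkg-config, if any
    has_data: Callable[[str], bool],  # hypothetical "is there data here?" check
) -> Optional[str]:
    """Return the first candidate location that actually contains data."""
    candidates = [with_option]
    candidates += [env.get(name) for name in ENV_ORDER]
    candidates.append(pkg_config_dir)
    for path in candidates:
        if path and has_data(path):
            return path
    return None  # the caller is expected to treat this as an error
```

Because the --with option is tried first, it wins over every environment
variable whenever it points at a directory with data.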

r138817:
Added a configure test to check that there is actually data in GIELLA_SHARED.

r138781:
The giella-shared data dir is now found using several techniques in the
following order:
* env. variable GIELLA_SHARED
* env. variable GIELLA_HOME
* env. variable GTHOME
* env. variable GTCORE
* using --with-giella-shared=/dir/to/giella-shared
* using pkg-config
If all these fail, configure errors out. Since it uses GTHOME among others, the
change should be of no concern to existing users having checked out everything.
And since the svn location is still within GTCORE, it will also work for those
checking out only the core and a single or a couple of languages.

r138673:
Second steps in renaming and splitting the gtcore into giella-core,
giella-shared and giella-templates: replaced $(GTCORE)/giella-shared with the
Automake variable GIELLA_SHARED.

r138663:
First steps in renaming and splitting the gtcore into giella-core, giella-shared
and giella-templates: renamed variables.

r137357:
Generalised the build instructions for the morphological segmenter, aka the
morpher. The morpher output can be used as input to a stemmer.

r136448:
Fixed a bug in speller builds introduced lately - missing hfst target.

r136431:
Updated filename reference, and added a pmatch setting that fixes the issue
where words next to punctuation like "ja." don't get analysed.


r136374:
Removed '+' in front of tag patterns to be extracted from the tag list and used
as input to regex generation scripts. This was done to accommodate the use of
prefix tags, where the '+' is at the end of the tag, not in the beginning.

r136363:
Added new test to check that the speller accepts all lemmas in the lexicon.

r136280:
Rewrote the pmatch compilation code to support Kevin's tokenisation hints for
MWE-ambiguous entries. Requires Kevin's hfst fork for now. Work in progress.

r136207:
Small change to support new style, backtracking based tokenisation experiments
on space separated compounds in sme.

r136015:
The next batch of changes to support building hfst fst's with a specified
backend fst format: desktop spellers are now supported. The speller fst's will
be built using the specified backend format up to the point where corpus and tag
weights are added, when the fst format will be changed to the default
(openfst-tropical) format. That is, even if you specify (the unweighted) sfst as
the backend format, the final speller will still be weighted.

r135695:
Better variable name and clearer comment about editing distance in spellers.

r135618:
Changed the build files for the desktop spellers to allow better user control
of which files to include in the error model.

r135594:
Use priority union to avoid duplication of paths and thus make a much smaller
(and hence faster) mobile speller fst.

r135584:
Use priority union to avoid duplication of paths and thus make a much smaller
(and hence faster) speller fst.

r135221:
Fixed bug in building Oahpa fst's for alternate orthographies and writing
systems.

r135102:
Fixed a bug in the default build of grammar checker analysers. Blocked all
languages without local overrides.

r134870:
Moved removal of word boundaries out of the default, language-independent
processing of the grammar checker analyser - we want to be able to do
language-dependent things with word boundaries, e.g. in freely compounding
languages.

r134780:
Added provisions for including xfscript files in the src/morphology/ directory.

r134762:
Removed unneeded subtraction that just increased the size of the resulting fst
a lot (how much of course depends on the grammar in question).

r134733:
Added initial support for doing more targeted regex replacements on multichar
sequences in parallel to the regular editdist operations. The idea is that these
replacements can be applied more times (since they are few), and thus allow for
more corrections of frequent spelling errors.

r134357:
Restricted the new spellrelax to only give one tag. The previous version caused
out-of-memory issues on a lot of systems.

r134222:
Added support for alternative orthographies in spellers. Works nicely in LO, but
needs more testing. Also updated the clean target.

r134145:
Added a new spellrelax system that will add an +Err/ tag (or more) to the
analysis of words misspelled according to the new spellrelax rules. Can be
very costly in terms of size if applied to large lexical fst's, and if many
error types are tagged, so initially it is only applied to the transcriptor
fst (which are used in Oahpa). Template data is from Plains Cree (crk).

r134126:
Fixed compilation error: added missing inversion (.i).

r133943:
Changed the final file format for hfst transcriptors to the hfstol format.

r133935:
Fixed a bug in speller building with Xerox tools enabled.

r133822:
Added support for filters for the top-level speller dir, in preparations for
needs by the Haida spellers.

r133578:
One more bugfix for tag reordering with language-specific additions.

r133568:
Fixed bug for tag reordering with language-specific additions. Made building of
glossing fst's configurable, and at the same time fixed a build bug for them.

r133537:
Added initial support for hfst-based tokenisers, built on generalisations of
Kevin's work. They are built using the hfst-tool hfst-pmatch2fst, which is the
Hfst implementation of the pmatch tool from Xerox. Supports a regular
tokeniser, and one targeted at grammar checking.

r133451:
Corrected errors in the makefile that stopped dictionary fst builds for
languages with alternative orthographies.

r133384:
Build analyser for grammar checker when grammar checkers are enabled.

r133325:
Generalised hack to force make to go via hfst instead of directly to hfstol.

r133254:
Added support for specifying backend fst format also for (parts of the) apertium
fst's. One step further to speed up compilation by specifying e.g. sfst as the
backend format. The implementation is a bit hacky, but will have to do for now.

r133243:
Added support for building glossing analysers, where the analysis tags are NOT
shifted around to canonical positions. The idea is that one keeps tags and
morphs together in the lexc code, and that the analyser output thus will
reflect the order of the surface morphs. If one wants to build such analysers,
one has to specify the final analyser filename in src/Makefile.am.

r132774:
Corrections to make the Oahpa builds work, and also to properly build with foma.

r132767:
Corrected an error that made the new option to select fst format (=backend) in
hfst non-functional.

r132760:
Now also Oahpa transducers are ready to be built with a specified backend format
when building using hfst. Also cleaned up the code, removed 300+ lines of code,
and added support for builds using Foma.

r132747:
A number of corrections to the previous commit for issues missed during first
round of testing. Now specifying an alternative backend format works correctly
for all standard analysers and generators except for Oahpa-fst's.

r132713:
Enabled the new option to specify transducer format when compiling with Hfst,
to speed up compilation time by using an unweighted format (ie sfst or foma).
Default is still openfst-tropical, until further testing is done.

r132700:
Further preparations for enabling the new option to choose the backend format
for fst's, for compilation speed improvements in cases where weight is not used:
generalisations and corrections of build instructions.

r132669:
Bummer: wrong default backend format - only openfst-tropical is stable, the
other formats are more or less buggy.

r132663:
More preparations for new configure option to specify backend format of
compiled fst's.

r132653:
Preparations for new configure option to specify backend format of compiled
fst's. Removed some old code.

r131846:
Further abstractions over parallel patterns, reducing code size.

r131125:
Remove generated files also in tools/mt/apertium/tagsets/.

r131115:
Updated required versions for Hfst and VislCG3. A number of bug fixes and new
features require these versions for many of our tools.

r130847:
One pattern rule had for some reason become ambiguous, and caused strange build
behavior. Replacing it with the full filename in one stable case solved the
issue.

r130730:
cut on linux does not like unicode chars as delimiters, use awk instead.

r130586:
Code cleanup - moved target variables related to running xfst tools to the
xfscript include file, and thereby removing duplicate code.

r130504:
Added an option to enable building tokenisers, off by default.

r130224:
Do the CG3 tag relabelling in the Giella infra, not in Apertium.

r130128:
Forgot to rename the Area variable in the previous commit.

r130117:
First iteration of adding support for Area codes (ie countries) based on ISO
3166 codes. Right now does nothing except filtering out the tags, proper
support coming in steps.

r129977:
Better handling of hfst/xfst/foma for the top-level speller dir - invert when
needed.

r129968:
There was still one more automake file with references to the
remove-derivation-position-tags.regex filter. Now they are gone.

r129939:
A typo made reversed compose&intersect seem buggy, whereas in fact it was not.

r129915:
Small correction to bring the Giella version of reversed comp&intersect closer
to what Miikka has: added minimisation to the reversed twolc rules.

r129867:
Added configure option to reverse the lexicon and the morph-phon rules during
composition and intersection. Reduced the time needed for that operation to
≈1/3 of what it used to be in SMS, and RAM consumption went down from 11Gb
to max 400Mb! Speed and RAM gains will vary from language to language.

r129841:
Now the lexical fst is first compiled into a .tmp file, to allow
language-specific changes to be applied from .tmp to final file. More
support for xfscript compilation.

r129823:
Added include to xfscript-include.am, to let xfscripts be used in lexc
compilation.

r129796:
Second part of libdir cleanup: removed the libdir line in the pkgconf file.

r129790:
libdir -> datadir for zhfst installations using autotools.

r129752:
Removed some references to remove-derivation-position-tags.regex that were
forgotten in commit r129657.

r129722:
Added analyser-disamb-gt-desc.hfst as a noinst_DATA target, to force make to
build it instead of going directly to the *.hfstol file, and thus breaking
compilation when local modifications are needed.

r129689:
Added INVERT_<FSTTECH> variables to help in improving compilation of analysers
and generators for different fst technologies (Xerox, Hfst, Foma). Hfst has the
inverse convention for lookup compared to the other two, and by using a
variable we can now actually share the same build code irrespective of which
one we need to invert for the final analyser or generator.

r129657:
Removed the remove-derivation-position-tags filter from language-independent
processing, it is language-specific, and will be added to the languages needing
it. This also makes it possible to do further local processing dependent on
these tags.

r129627:
Error out if Hfst is requested but not found or too old.

r129095:
LibreOffice-voikko 5.0 support for spellers with alternating writing systems.

r129044:
Initial support for building zhfst files for mobile phone keyboards. This
version is essentially the same as the desktop one, we'll start from here and
adapt as we find better solutions. The zhfst file is compressed using xz for
optimal file size (this is presently in violation of the zhfst specification,
it must be updated soon). Also changed some of the configure options to error
out when requested but without the required software installed - this is better
than silently turning the requested feature off.

r129031:
Cleaned the speller build and configuration code in preparation for adding
support for building mobile spellers.

r129012:
Added a missing SUBDIR, and fixed a speller test script that was not working.

r128994:
Updating path in test script.

r128983:
Corrected a miss in the previous commit.

r128981:
Moving the hfst speller test dir inside
test/tools/spellcheckers/fstbased/desktop/.

r128977:
Preparing to reorganise the speller testing parallel to what has been done in
the development dir.

r128972:
Updated path to desktop speller files.

r128934:
Corrected a few misses in the previous commit.

r128931:
Major reorganisation to support building zhfst files for mobile systems (aka
keyboard + speller). These need very different weighting priorities, another
error model, and are thus placed in a separate subdirectory from desktop
spellers.

r128901:
First step in adding support for mobile phone spellers.

r128872:
Added support for building LO-voikko 5.0 extensions. Python-based interface to
LO, and initial support for specifying unknown speller languages by typing in
the language code in the language name field.

r128639:
Commented out xz compression, it isn't supported by libvoikko.

r128072:
Changed test pair conventions for twolc from !€/!$ to !!€/!!$ to make it follow
the conventions in the rest of the infrastructure, and make it possible to
include test data in the documentation.

r126367:
Readded the initial-letter edits in the regex - everything else is there for
the initial letter machinery, so leaving it out made the build inconsistent.
The default is off, with a large warning for those turning it on.

r126340:
Added script to run suggestion testing for the hfst-ospell-service (MS Office)
speller. Rewrote the speller testing scripts to allow parallel execution.

r126115:
Make transitivity tags optional also for the Apertium generator.

r126072:
Push weights even when not minimising the speller acceptor. Minimisation is not
always the best strategy.

r125928:
Removed --Werror from the language-independent automake file. Added a variable
to make it possible to add it to the language-specific automake file.

r125918:
Added configure option to enable symbol alignment during lexc compilation for
the lexical transducer. Defaults to off for now, we need to test the effect on
various languages before making it default to on. Also added --Werror to lexc
to make it break on all warnings when compiling the lexical fst.

r125889:
Use tar + xz for a 40-50 % reduction in file size for zhfst files.

r125801:
Allow longer filenames by using tar-pax for make dist.

r125756:
Added upload target for zhfst files. That will be the only method for spell
checking in more than one language for now (for regular users). Not ideal, but
have no time for anything else.

r125485:
Ensure that all required cg3 files are copied over to the apertium dir. Also
make sure that included files are copied before including files are processed.

r125470:
Silent build updates for Apertium.

r125444:
No morphology backend for now in our infra. Corrected typo.

r125405:
Added support for the vfst fst format for voikko-based spellers, to be used in
mobile apps.

r125348:
Corrected typo.

r125331:
Upload xpi and MacVoikko files, beta versions.

r124945:
Look for saxon in $HOME/lib first. Fixes bug
http://giellatekno.uit.no/bugzilla/show_bug.cgi?id=2100.

r124920:
Add lexicon version to the speller testing output.

r124883:
Added a new variable HAS_FOMA, which will be set independently of the
configuration if foma is available. This can be used to circumvent bugs in
Hfst if weights are not needed: if foma is available, print as ATT, read in
foma, perform transformations, print as ATT, convert, and continue.

r124785:
Error out if one tries to build abbr files with generators disabled.

r124773:
Error out if syntax is enabled and no vislcg3 is found or too old.

r124730:
Added support for building abbr.txt. Copy of the sme template committed in
r111579. Hopefully fixes bug 2030.

r124428:
Added targets for foma spellers, outcommented now due to build issues. Added
more silent build strings.

r123664:
Added some general tag cleanup before making the speller fst used as input for
the analyser and generator that is the last step before building the acceptor.
Makes it easier to write yaml tests for the speller fst's.

r123037:
Added filter to remove tags irrelevant to speller builds. Adjusted required
version of GTCORE accordingly.

r122780:
Corrected a bug with filter compilations for speller filters involving tag
conversion to flag diacritics.

r122724:
Make sure analyser-raw-gt-desc.hfst is always built, to ensure we have the
necessary prerequisites for all targets. Refactored the initial speller fst
build to use common build code for all fst technologies. Makes it easier to
test and compare test results when debugging.

r122125:
Changed the response to missing transducers from FAIL to SKIP to avoid problems
with lexc tests for fst's not enabled and thus not available. Instead report
the missing fst to the user.

r122053:
Streamlined descriptive compounding tags to follow a shared tag structure.

r121755:
Added a comment about the non-functioning of the initial edit setting. Made the
compound-restricted fst a tmp file, to allow for additional local processing.

r121729:
Removed all minimization of the error model except for the final build step.
Removed also the initial letter handling for now, it blows up the error model,
and slows it down correspondingly, making spellers that have turned this on
useless. For now we apply the regular error model on the first letter, that
seems to work ok.

r121564:
Added a very short test script written by Lene to help run a subset of tests
frequently needed.

r121514:
Fixed a problem running bc on the linux servers, which caused the yaml test
summaries to be blank.

r121418:
Added an option to specify how many lines of the frequency corpus to be used in
the frequency weighting, to trim the acceptor fst at a point where the weights
don't really matter. Removed all occurrences of remove-epsilon, determinisation
and minimisation of intermediate speller fst's - this cut the size of the final
acceptor in two!

r121218:
Replaced 'giellatekno' with 'giella' or added Divvun, depending on context.

r121210:
Renamed m4/giellatekno.m4 to bring it in line with the switch to 'giella' for
all things common to GT and Divvun.

r121204:
The previous commit did not solve the issue - the different jars were checked
in the wrong order. Now it should be ok.

r121189:
Added standard Linux location for Saxon to the paths searched. Fixes bug #2080.

r121124:
Corrected path for pkgconfig data and one variable name in MT filters.

r121097:
gtdshared has been renamed to giella-shared, all references now updated.

r121060:
More robust handling of MWE in speller testing. Now also possible to specify
build dir different from source dir.

r121050:
Added a Makefile.am variable to turn on or off corpus-based (frequency)
weighting of suggestions. Default for the time being is off while we work out
the best interactions between the different parts of the spellers. Changed one
intermediary filename to ensure proper dependency checks and thus rebuilds.

r120994:
Added support for specifying regexes or list of string pairs for initial and
final symbols in the error model. Also added a Makefile variable to control
whether to allow edits of the initial letter(s), default is ‘no’.

r120628:
Guard against -q for lookups that don't support it in test scripts.

r120599:
Small code cleanup that has been lingering since June.

r120420:
Made use of a new option for the speller suggestion testing: output an attribute
on each test word element containing essential info about the correct
suggestion. This will support better styling of the xml file with the test data.
Also changed the path to the css from the local filesystem (which will vary from
machine to machine) to the svn repository web url.

r120182:
Added a variable to hold source files to be included in the distro but not
compiled as such.

r120120:
Added first version of a shell script to check the suggestions generated by
spellers. Requires the file test/data/typos.txt for data input.

r120070:
Shortened a filename to make tar happy when building distribution packages.

r120030:
Fixed an error in distcheck - one test shell script was not included.

r120005:
Made one step in the speller build behave properly wrt silent builds. Removed
grammar checker targets, we are far from ready for this, and it breaks
'make distcheck'.

r119950:
Added a variable to pass a compilation option to hfst-regexp2fst. Used this
variable to compile all filter regexes with the option --xerox-composition=ON.
This will ensure that all filters where flag diacritics are used as symbols will
be compiled correctly for proper use in later compositions. Among other things,
this fixes a bug where tags converted to flags to restrict compounding did not
work at all.

r118834:
Replaced sed expression with double cut - the sed did not work on the xserve for
whatever reason, and caused the testing to hang.

r118809:
More robust checking of Saxon, now requires that any jar found is at least v8.0.

r118657:
Added /usr/share/java/ as a search path for the Saxon jar, this is what is used
on the UiT Linux virtual machines, and probably many other Linux systems.

r118602:
Initial support for building Mozvoikko spellers for our languages.

r118333:
Adding support for specifying one-sided tests (half tests) in the lexc test
data, using an optional .gen or .ana "suffix" after the fst name. Simplified
source file processing.

r115589:
When building with Foma, use the new lexc-align feature.

r115439:
Added lexicon filtering when pair-testing twolc rules.

r115155:
Corrected e-mail address, changed the template content of the transcription
files from SMA to CRK, and at the same time corrected the direction of the code.
Also added a default punctuation lexicon.

r115069:
Added support for easter eggs specific to alternative writing systems and other
variants. Will help in debugging.

r114915:
Moved specification of default weight and editing distance to the language
specific Makefile.

r114904:
After a lot of experimenting, a moderate set of changes to the speller error
models. The biggest change is that the alphabet for the edit distance error
model is not taken from the acceptor anymore, but must be explicitly listed in
the editdist.*.txt file. The suggestion speed is back to normal, but more work
is needed re the interaction of the error model and corpus weights.

r114860:
Prefixed all silent build strings for Hfst tools with H, for easier
identification.

r114468:
Commented out the old target for calculating unit weights (default weight for
out-of-corpus word forms), and added a new which is basically the highest
tropical weight + the ALPHA smoothing value. This is just the first step in
further developing the suggestion ordering for the spellers.
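
As a rough illustration of the unit weight described above (a sketch only, not
the actual build target): assuming the corpus weights are negative log
probabilities with additive smoothing, the default weight for out-of-corpus
word forms is the highest corpus weight plus ALPHA. The function names, the
smoothing formula and the toy counts are assumptions made for this example.

```python
import math


def corpus_weights(counts, alpha=1.0):
    """Map each word form to a tropical weight (-log of its smoothed probability).

    Assumption: plain additive smoothing with ALPHA; the real build may differ.
    """
    total = sum(counts.values()) + alpha * len(counts)
    return {word: -math.log((count + alpha) / total)
            for word, count in counts.items()}


def unit_weight(weights, alpha=1.0):
    """Default weight for out-of-corpus forms: highest corpus weight + ALPHA."""
    return max(weights.values()) + alpha


# Toy frequency data, hypothetical.
counts = {"ja": 50, "lea": 30, "guolli": 2}
weights = corpus_weights(counts)
default = unit_weight(weights)
```

With this scheme any word form missing from the corpus is always ranked below
every word form that was seen at least once.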

r114332:
Added a simple test to check a minimum suggestion speed for our test word
nuvviDspeller. No speller should be released that does not pass this test.
Additional and more elaborate tests should be added as well, this is just the
very bare minimum in suggestion speed testing.

r114310:
Corrected typo in twolc compilation for foma (using hfst).

r114286:
Worked around a bug in hfst-fst2fst by going via att and foma instead.

r114281:
Initial support for compiling twolc files for foma by way of hfst, intersect and
conversion to foma format.

r114124:
Yaml testing is now working also when building with Foma.

r114105:
Fixed downcasing of derived short names. Made yaml testing output a bit more
readable (hopefully).

r113871:
More robust xfscript build code for hfst-xfst. Clean hfstol files.

r113770:
Extended Foma support to alternate writing systems and orthographies. At the
same time put to use a new idiom to handle multiple independent target
variables / patterns, which will be useful in other contexts as well. The code
was generalised using this new idiom, and effectively reduced to half the
original code size, with much less duplicate code.

r113520:
First working version of foma builds. The basic set of analysers and generators
are built, but nothing else. A lot of changes to variables and build rules,
including generalisations that save quite a few lines of code.

r113372:
First steps to support building with Foma. Lexicon compilation is working, but
note that Foma crashes on regexes in lexc.

r111916:
Slightly more robust pair-testing with hfst.

r111898:
Corrected rsync options also for the alternate writing system oxt's.

r111883:
Fixed a build bug for MacVoikko, causing the final target to always be out of
date. Regulated verbosity for zhfst targets. The twolc testing scripts now
print a message when there is no test data. The Hfst twolc testing script
properly detects when there is no test data, and exits with the SKIP (77) value.
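
The SKIP convention mentioned above can be sketched as follows. This is a
minimal illustration in Python rather than the actual shell test script; the
function name is hypothetical. The only fixed fact is that the Automake test
harness treats exit status 77 as a skipped test.

```python
import os

SKIP = 77  # exit status that the Automake test harness reports as SKIP


def exit_status_for(test_data_path):
    """Return SKIP when there is no test data, otherwise 0.

    A real test script would run the pair tests instead of returning 0.
    """
    if not os.path.isfile(test_data_path) or os.path.getsize(test_data_path) == 0:
        print(f"No test data found in {test_data_path}, skipping.")
        return SKIP
    return 0
```

A test script would pass the returned value to sys.exit(), so that missing test
data shows up as SKIP rather than as PASS or FAIL in the test summary.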

r111786:
Added support for alternate writing systems for spellers.

r111627:
Finally got all weighting to work as intended, including the no-sugg weights.

r111511:
Further modularisation and improvements to weighted spellers. With hfst3
revision 4329, using a tab-separated tag reweighting file is working.

r111192:
Do not remove usage tags when building spellers, speller tags were thrown out.

r111179:
Added an attempt at normalising the corpus-based weights towards a standard max
upper weight, to allow a much higher weight for strings not to be suggested.
Also split the processing of adding corpus-based weights and morphology weights
into more steps - retaining each intermediate fst - to allow easier debugging
of the weight assignments.

r110944:
Xerox composition of weights and lexical fst.

r110791:
Moved a script for cleaning weighting corpus to the core. Require new core.

r110770:
Fixed bugs related to the new support for frequency-weighted spellers: missing
checks for required tools.

r110758:
Stupid copy-paste error turned the positive test into a negative. Now corrected.

r110747:
Skip Xerox testing if no test data is found. Added comments.

r110734:
Added pair-test for hfst, improved pair-testing for Xerox' twolc.

r110661:
Add a huge weight to words tagged with +Use/SpellNoSugg.

r110646:
Added support for corpus-based (frequency) weighting of the speller fst's. Also
reorganised where to specify the tag-based weights (and this is subject to
change pending a bug fix in hfst-reweight). All languages are given a toy
corpus, which can be replaced with a real one. This is finally the core of
Tommi's dissertation applied to all languages.

r110513:
More robust testing for Xerox fst's - will properly report all generation fails.

r110464:
Corrected tests for nouns and propernouns. Now nouns behave correctly with hfst,
and proper nouns have correct tags.

r110331:
Modernised the generate-noun-lemmas.sh.in script, added similar scripts for adj,
proper nouns and verbs.

r110301:
Check that yaml testing is enabled before running yaml tests in test/tools/.

r110208:
Require new version of the core, updated comments about Err tags.

r109663:
Removed CmpNP tags from downcase-derived-proper-strings.xfscript.in.

r109310:
When doing 'make clean', remove generated html files in the root dir.

r109253:
Removed multichar definition of superfluous flag diacritics.

r109225:
Added a new directory named devtools/ to each language, with the idea that it
should contain tools useful for development, but not necessarily suitable for
automake testing. Initially it contains shell scripts to generate a table of
generated word forms for each continuation lexicon.

r109112:
Removed corpus names from tools/spellcheckers/fstbased/hfst/data/Makefile.am. It
caused the build to stop with an error for all languages except FIN.

r109098:
Make building the abbr.txt configurable (default=no), check for the existence of
src/morphology/stems/abbreviations.lexc, and error out if not found.

r109092:
Forgot to include the new Makefile (r109076) in configure.ac.

r109077:
Path correction.

r109076:
Preparations for supporting corpus-based frequency weights, as per TommiP.

r109063:
Enabled weighting of speller fst's. Adjust weights and tags as needed.

r108914:
Added support for all languages to generate the abbr.txt file used by
$GTCORE/scripts/preprocess. At the same time added initial support for
compiling pmatch scripts into fst's for hfst-proc2, which is the future
alternative to preprocess.

r108859:
Forgot to remove some debug statements from the yaml test runner. Now cleaned.

r108840:
Moved MWE tag processing into the core - we want this for many languages.

r108818:
Added support for a new type of yaml tests: speller acceptance testing. The
basic idea is to just give a list of words and word constructions (compounds,
derivations, etc) the speller should accept or reject, and let the yaml test
bench verify whether this is actually the case.
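
The idea can be sketched like this. It is an illustration only, not the yaml
test bench itself: the function name is hypothetical, and a plain word set
stands in for the speller acceptor fst.

```python
def run_acceptance_test(accepts, should_accept, should_reject):
    """Return the (word, expected, actual) triples that failed."""
    failures = []
    for word in should_accept:
        if not accepts(word):
            failures.append((word, "accept", "reject"))
    for word in should_reject:
        if accepts(word):
            failures.append((word, "reject", "accept"))
    return failures


# Hypothetical stand-in for the speller acceptor: a plain set of word forms.
lexicon = {"guolli", "guollebivdu"}
failures = run_acceptance_test(
    lexicon.__contains__,
    should_accept=["guolli", "guollebivdu"],
    should_reject=["guoli"],
)
```

An empty failure list means the speller accepts and rejects exactly what the
test data says it should.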

r108755:
Several changes to properly support all position-based +CmpN/XX tags:
* moved tag path splitting and tag-to-flag conversion into separate regex
  files in the core.
* added support for compiling and using the new regexes
* added support for a new type +CmpN/Suff
* added the required multichar symbols to the root.lexc files
* increased required core version number

Fixed a bug in the yaml test bench when both hfst and xfst was enabled, but
where only one type is built, e.g. for Apertium.

r108692:
Added build support for alternate orthographies: default fst's, dicts and Oahpa.

r108675:
Fixed a bug that caused the wrong fst to be picked in certain cases, which
caused the test script to fail.

r108646:
A couple of changes related to testing:
* require Python 3.3+
* require new gtcore
* update YAML test runner to make SMS testing work as intended also with Xerox

r108560:
Added support for country/region specific proofing tools in configure.ac.

r108404:
We do not support anything but the latest/newest Voikko now.

r108395:
Finalised the basic multiple writing system support, by adding support for Oahpa
and dictionary fst's.

r108384:
Added a configuration flag to enable two-step compose-intersect. In most cases
this will not make any difference, but for some languages it will correct a
bug in compose-intersect that would otherwise create a bad fst, and for other
languages it will make the operation much slower without changing the fst.
Disabled by default, whether it is useful must be tested in each case /
language. Also made the verbosity handling such that when verbosity is
on (V=1), some tools are now more verbose, for better help when debugging.

r108355:
Corrected errors in hfst compilation of alternative writing system fst's.

r108322:
Added test runners for generation and analysis tests only for the descriptive
fst.

r108289:
Compilation of the default set of fst's with alternate writing systems working.

r108187:
First step in adding support for alternate writing systems and orthographies:
adding variables to configure.ac. Removed the variable LO_min_version, it
isn't used.

r108134:
Split the m4/ax_python_module.m4 file, it contained mostly java autotools stuff.
Improved the message to update the gtcore.

r107335:
Added the make-optional-hyph-tags filter to the generators. Fixes bug #1914.

r107310:
Make use of the new remove-adv_comp filter. Require new core and newest hfst.

r107278:
Put to use the make-optional-adv_comp filter.

r107272:
Don't build xerox fst's within the Apertium dir tree - no need for it.

r107222:
Require new core because of new filters. Use hfst-optimized-lookup in the yaml
testing, should speed up hfst testing quite a lot.

r107150:
Put the new optional minip filter to use, and increased required gtcore version.

r107122:
Replaced all instances of sub and lexsub filters with the new, generated error
filters.

r107115:
Added support for extracting error tags and constructing filters for
manipulating error strings and tags. Updated required version of gtcore.

r107106:
Remove variant tags in disamb analyser.

r107047:
Xerox fst's are irrelevant to Apertium, don't even try to build them.

r106998:
Use the new make-optional-v1-tags filter for apertium generators.

r106982:
Forgot to include the new regex in the src file listing in the previous commit.

r106973:
Corrected dictionary generators to require a variant tag except for +v1, which
is optional.

r106966:
Removed 'invert net' from a couple of more instances.

r106957:
Treat Hfst and Xerox the same during *tmp.Xfst and *.Xfst build - invert both
only in the last step when going from tmp to non-tmp fst (invert the analyser
for hfst, the generator for xfst). This should remove one more confusing
difference between the two.

r106951:
Check that we have at least Python3.1 when enabling Apertium, error out if not.
Also add AM check for hfst-optimized-lookup.

r106451:
A small, functionally equivalent change: from suffix rule to pattern rule.

r106428:
Now +CmpN/Pref is correctly supported (earlier it was treated as +CmpN/First).

r106402:
Corrected fst file reference in test shell script.

r106398:
Corrected source file reference in test shell script.

r106356:
Changes to a couple of Makefile.am files to fix issues with 'make dist'.

r106346:
The last part of the CmpN location restriction flag diacritics added.

r106245:
Code cleanup: no use for the M4 part - the null alternative did not work.

r106226:
Finally nailed all combinations of fst compiler and lexicon minimisation - now
downcasing of derived proper nouns is working as it should again for both Xerox,
Hfst hyperminimised and Hfst normal lexc compilation.

r106160:
+CmpN/Only supported, first steps in tag splitting taken.

r106122:
Moved code common to all yaml testrunner shell scripts to an include file in
GTCORE to avoid code duplication and reduce the risk for introducing bugs.
This requires the newest version of the CORE. Because of the inclusion, I had
to rename the test runner to .sh.in, and added autoconf processing of it. Also
added a test file for testing the base speller fst (it must be tailored to each
language of course).

r105950:
Last change to get hyperminimisation to produce the correct output: made the
derived-proper downcase script be processed by autoconf, so that we can
require a symbol in a certain context, and at the same time in the end let
the symbol be empty if not needed.

r105935:
Added optional flag diacritic inserted by Hfst hyperminimisation. This resolves
the remaining cases of errors after the hfst team fixed a bug in lexc
compilation with hyperminimisation turned on. Since it is optional, it does not
do any harm when using Xerox or when not using hyperminimisation.

r105926:
Added xerox variable flag-is-epsilon to the tag reorder regex. This fixes most
of the cases of errors after the hyperminimisation bug was fixed in hfst-lexc.
The remaining errors must be fixed in the downcase-derived-proper regex.

r105715:
Added more silent builds for hfst tools.

r105673:
Added conversion of tags to flag diacritics for position-restricting tags. These
are currently used in sma, sme, smj and sje. Together with some additions to the
R lexicon, the tags will finally do what they are meant to do for hfst-based
spellers.

r105616:
Added Multichar symbol definitions for flag diacritics controlling compounding
based on position tags. Done for most langs, the symbols will be ignored if not
used.

r105496:
New: added example test file for the fstspeller fst file (starting point for
foma and hfst spellers).

r105492:
Fixed: errors in the yaml test runner when the fst has a suffix 'hfst'.

r105488:
Fixed: directory and fst names in the yaml runner shell script.

r105484:
Added support for yaml tests for speller fst's.

r105438:
Added support for Xerox fst's in tools/spellcheckers/fstbased, mainly to help in
debugging hfst. Turned out to be very useful. Why can't any of the toolsets
work properly?

r105424:
Improved comments to make the lemma generation script easier to adapt.

r105390:
Additions to generate the inverted fst's, to enable symmetric yaml testing.

r105382:
Fixed: order of filter application was wrong, causing all Use/-Spell forms to be
included in the spellers.

r105286:
Fixed: error in easter egg building after the previous commit.

r105284:
Make sure the easter egg is rebuilt every time the fst is rebuilt.

r105238:
Fixed: The MacVoikko target contained one subtarget that built even when
spellers were not enabled, and thus failed because of a missing dependency.

r105201:
A number of changes to make the MacVoikko.service build cleanly with proper
dependency tracking. Also a bit safer cleaning.

r105194:
Fixed: The MacVoikko target was missing from noinst_DATA, thus it was not built.

r105104:
Added initial support for building language-specific macosx systemwide spellers.

r104185:
Added strip function to get rid of extra spaces, resolves bug in abbr.txt build.

r103029:
Included lexc files in src/morphology/ in the abbr file making.

r103027:
Expanded the source file base for building the abbr file, more like the old
infra.

r102952:
Only delete (aka 'make clean') generated corpus files used for weighting if
such files exist. Removes a very dangerous 'rm -rf .*' command.
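
The safer cleaning pattern can be sketched as follows: delete only explicitly
named generated files, and only those that exist, instead of relying on a
wildcard. This is a Python illustration with a hypothetical function name, not
the actual clean rule.

```python
import os


def clean_generated(paths):
    """Delete only the explicitly listed files that exist.

    Returns the list of files that were actually removed.
    """
    removed = []
    for path in paths:
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

Files that are not present are silently ignored, and nothing outside the given
list can ever be deleted.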

r102809:
Fixed bug in the phonology building that caused extra source files not to
be compiled.

r102678:
Removing Use/LexSub strings from all normative fst's. Fixes bug #1904.

r102214:
Added support for turning off building of vislcg3/syntactic tools.

r101825:
Improvements and corrections in the README file.

r101818:
Changed Hfst configuration:
* moved xerox check before hfst check to ...
* automatically enable hfst if the Xerox tools are not found
* moved minimum version requirement definition to configure.ac
* removed hfst-foma requirement, instead checking for all required tools
* removed path check for obsolete hfst tools
* improved hfst configuration messages
* updated the summary text to reflect that hfst is automatically enabled
These changes should ease configuration on systems without Xerox.

r101729:
Corrected names of compiled twolc files in test/src/phonology/pair-test*.sh.in.
We need to use the 'compose' fst because compiled twolc files are not treated
the same as other fst's. We can't just skip the new lookup friendly filenames
either, because morphophonological rules can be written using xfscript, in
which case the lookup renaming (and inversion) is essential.

r101575:
Corrected references to the new lookup style fst names in the inituppercase
test. Fixes broken inituppercase tests. Updated config header in initcap yaml
file correspondingly.

r101554:
Now both general and language-pair specific relabelling using regexes
are supported, in addition to using relabel files. The regexes allow
context-dependent and multisymbol changes, whereas the relabel files only
cover 1:1 mappings of single symbols. The actual change was to add support
for regex files in the language-pair independent processing. The
tools/mt/apertium/tagsets/README.txt file was more or less completely rewritten
to better document the filenames being recognised, and how they should be used.

r101434:
Retain the regular non-optimised hfst analyser for easy paradigm generation
using a regex plus composition.

r101405:
Fixed a bug in the Apertium build that blocked building of AP-tagged analysers.

r101363:
Make sure there is always an apertium analyser for 'und' if nothing else.

r101193:
Do not remove homonymy tags from the apertium fst's. Also simplified the
automatic conversion by moving all non-automatic changes to a separate file,
run as a sort of tag conversion postprocessing. Updated the tagset/README.txt
file to contain info about the manually maintained postprocessing relabel file.
Added an initial postprocessing relabel file containing word boundary and
homonymy tag changes.

r101189:
Do not remove homonymy tags from the regular analysers.

r101162:
Fixed a bug in building Oahpa generators - orig-lang tags were not removed.
Clean *.hfstol files in tools/mt/apertium/.

r101043:
Moved Apertium tagset creation and relabeling from src/tagsets/ to
tools/mt/apertium/tagsets/. This should fix building of apertium fst's for
fin, smn.

r100989:
Renamed AWK to GAWK in relevant places to get around another AWK test. Now gawk
is found properly in all cases.

r100985:
Improved test for gnu awk.

r100878:
Require newest core to force people to upgrade to get an important bugfix.

r100786:
Fixed a bug in the core for generated regexes - a reserved char was not escaped.
Required core version bumped.

r100719:
Hfst 3.8.0 is out, with a number of important bug fixes and improvements,
including new options required to make our code build properly.

r100565:
Several changes to accommodate a downcaseerror variant of the L2 error fst for
Oahpa:
* added configure.ac option --enable-downcaseerror (independent of the L2 opt)
* a number of changes to the build instructions for Oahpa to support the new fst
* made the error fst compilation independent of whether an L2 twolc/xfscript
  file is used - if not, it will just use the ordinary twolc/xfscript file.
  This way it is possible to 
* svn-copied regexes from the old to the new infra, including to the core
* increased gtcore version number and required version number due to new regexes

r100542:
Corrected wrong filenames and file references that blocked the oahpa L2 build.

r100538:
Tagset relabeling didn't work for xfst files, now it does. Also generalised the
use of relabel files (for use with hfst-relabel).

r100499:
Simplified the building of hfst's with alternative tagsets. Silenced regex
compilation.

r100478:
Last part of the lookup & composition cleanup: phonetics and phonology now
covered. Now all non-lexical and non-filter files have a suffix .compose.*
or .lookup.* depending on their intended use, and they are all properly
inverted where needed (i.e. only for Xerox' lookup tool). There might still
be source files to clean, but that is a separate step.

r100468:
Corrected a couple of cases where old filenames were still used, and thus
broke compilation. Also improved filtering of transcriptors, and constructed
transcriptor target names dynamically based on the source files.

r100453:
Xfscript and lookup cleanup: now we explicitly build files made for lookup and
composition marked in the filenames. This is done for hyphenation and for
orthography, phonology and phonetics still to be done. From now on there should
be no need to use invert as part of the xfscript code - DON'T DO IT! All
targets updated to use the new filenames. Removed inversion from the
hyphenation xfscript.

r100424:
Use explicit pipe mode with hfst-xfst.

r100413:
Moved Apertium target language specification from configure.ac to
tools/mt/apertium/Makefile.am. Changed the target filename construction
to better follow the Apertium naming scheme. Fixed a bug introduced about
four weeks ago that destroyed the dependency chain (due to a bug/fragileness
in GNU make).

r100356:
Cleaned up building of target fst's using the lookup-include.am file. Now all
hfst transducers in optimised lookup format have the suffix .hfstol, and
optimisation should not be hidden or implicit anymore. All test scripts should
be updated as well. Also move all common targets from src/Makefile.am to
am-shared/src-dir-include.am and sub-included AM files. This cleans up the
src/ dir Makefile.am quite a lot.

r100346:
Added support for additional local lexc files not part of the lexical fst.

r100126:
Several changes to clean up the mess with the transcriptors:
* moved transcriptor final builds from src/ to src/transcriptions/
* renamed transcriptor source files and targets
* streamlined transcriptor compilation to use lexc-include and lookup-include
* also silenced xfst in lookup-include.am

r100035:
There were a couple of issues in the previous commit:
* vpath directive didn't work reliably
* L1 and L2 variables were declared for easy merging, but in a way that
  AM didn't like
* forgot to change the name of the lexical fst in the filter processing

r99883:
Several fixes to accommodate L2 (language learner) analysers for Oahpa:
* removed silent build instructions from twolc-include (they are taken from the
  silent-build-include instead)
* added support for compiling L2 phonology/twolc files when configured to
* renamed $(GTLANG)-lexc.?fst to just lexicon.?fst.
* added support for the error analyser in src_oahpa-include.am
* added configure support for the L2 analyser (off by default)
* added support for building the L2 lexical fst using L2 source files
* added variables a.o. to support specifying L2 source files in src/morphology/

r99793:
Added support for filters written in lexc and xfscript. Renamed variables and
added a lexc-include.am file to support general lexc compilation.

r99665:
Fixed an unfortunate AM syntax error that blocked Automake, and thus all builds.

r99587:
Another filter build cleanup: all filter regexes in core are now built for all
languages. One obsolete filter was removed.

r99584:
Fixed a problem with MT filter compilation that only revealed itself in sme.

r99579:
Cleaned the filter build files even more. Now only local / language specific
regex source files need to be listed in the local Makefile.am.

r99574:
Added a new filter to the filter compilation. Used the new filter to build
correct fst's for dictionary analysis and generation. Increased the version
number of the required gtd core version, due to the new and required filter
in the core.

r99544:
Major cleanup of filter and tagset compilation:
* moved all non-local data and build instructions into am-shared/
* created dir-specific am-include files
* clean use of regex-include.am
* removed sme-specific source files from tools/mt/apertium/tagsets/Makefile.am
* switched the apertium filter use to use the one built in src/filters/ instead
  of rebuilding it

r99473:
analyser-oahpa-gt-desc should be analyser-oahpa-gt-norm. Now renamed.

r99462:
The listbased speller fst is now generated properly using both Xerox and Hfst.

r99451:
Fixed a logical error that turned off all hfst spellers. Renamed a variable.

r99445:
Only build Apertium tagsets in tools/mt/ if Apertium is turned on.

r99425:
Corrected a syntax error in the src_disamb-include.am file. Moved all fst
trimming of general interest from tools-spellcheckers-listbased to
tools-spellcheckers. Made the configuration so that list-based spellers will
only compile if configured to build Hunspell. Also tried to make the
configuration of other spellers such that they are automatically off
when spellers are off.

r99366:
Batch two of the initial letter downcasing fix.

r99350:
Downcasing of the initial letter of derived proper nouns (Pariisi ->
pariisilainen) is now finally working with Hfst. It requires Hfst svn rev. 4000.

r99221:
The first major step for adding support for generating list-based spellers such
as Hunspell and the PLX (Polderland/MS Word) spellers. The conversion is not
trivial, since we try to control compounding according to the linguistic
specification in the lexicon (using tags). Although PLX is only for three Sámi
languages, Hunspell conversion should be useful for all languages in our
infrastructure. No real Hunspell or PLX files produced yet, only prerequisite
fst's. - At the same time fixed a glitch in the version checking of VislCG3
that would turn off support for CG files now that the vislcg3 svn revision
number has turned 10 000.

r99176:
Added support for local overrides of the base speller fst.

r99109:
Generalised and simplified the code for building oxt's - no more hard-coded
filenames. Now the LO-voikko versions supported as well as the platforms are
just defined in two variables, and the rest follows from there. The build code
also handles cases of unsupported combinations of voikko versions and platforms.
Also silenced the build quite a lot in non-verbose mode.

r98986:
Switched to universal binary build for the LO41 voikko OXT.

r98767:
Made the hfst optimised lookup file format explicit by using the .hfstol suffix,
and by optimising files for lookup in a separate build step, instead of
implicitly as before. So far only for tools/mt/apertium/, but more will come.

r98696:
Made speller minimisation default to yes, specified where to push weights.

r98671:
Added --encode-weights to determinise and minimise. This fixed the never-ending
compilation of Finnish spellers.

r98633:
The optimisations that worked for Greenlandic didn't work for Finnish,
potentially due to Finnish being corpus-weighted and thus posing more
challenges to determinisation and minimisation. Because of this the
Greenlandic optimisation is now rolled into the configuration
option --enable-minimised-spellers.

r98616:
Added size and speed optimisations to the speller compilation process:
remove-epsilons, push-weights, determinise and minimise. Together this made
the KAL speller *much* smaller and *much* faster. It is now as fast and small
as any other fst-based speller.

r98563:
Hyperminimisation seems to be stable now, and we offer it as a standard
configuration option. Also added autoconf support for the preliminary tool
hfst-proc2, to facilitate easier testing of the tokeniser/analyser.

r98486:
Updated the tagset targets to support Xerox fst's, and tagset replacement using
regexes instead of the hfst-only relabel tool. Now all languages can get
localised analysis and generation tags by adding a regex file and specifying a
few targets.

r98469:
Added build step to explicitly convert hfst transducers to optimised lookup
format. Whitespace changes in the silent rule variables. Included the new
lookup-include file in src-dir-include.am.

r98459:
Preparations for better handling of lookup & testing of free-standing lexc and
rewrite rule transducers: added build rules to do inversion of fst's intended
for lookup.

r98454:
Added a test dir for the upcoming hfst-based tokeniser.

r98323:
Corrected some paths to enable VPATH building of spellers. Added support for
retaining intermediate files when building using "make --debug".

r98165:
Added support for building OXT for LO/OOo 3.6-4.0 for Mac. Language support is
limited.

r98043:
Properly clean src/morphology/.

r98034:
Encapsulated most shell variable names in {} to handle hyphens etc in the
variable names (after merge/update substitution of the __UND__ string).

r98024:
Added a dir src/morphology/generated_files/ containing files generated during
the build process. This is done to make a clear separation between files to be
edited and files to be ignored. Also added a directory src/morphology/incoming/
to hold incoming lexical resources used to build the lexc or xml source files.
Both dirs have a 00README.txt file explaining their use.

r97047:
Added WANT_OAHPA option for analysers for all languages (until now only
generator)

r97041:
Added oahpa analyser as target (oahpa here meaning L2 transducer)

r96645:
Fixed a bug in sigma extraction on certain Linux systems.

r96604:
Better/more generalised handling of tag modifications.

r96280:
Added removal of lines marked '#RemoveFromApertium' from the apertium cg3 files.

r96100:
Temporarily removed the downcasing of derived proper nouns from the hfst
transducers - it causes them to malfunction.

r96043:
Fixed bug introduced yesterday that broke compilation of certain xfscript files.

r95955:
Forgot two files in the previous commit.

r95954:
Added support for testing the fst for initial upper casing of strings. This also
includes yaml test support for non-analysing/-generating fst's.

r95796:
Properly handle downcasing of derived proper nouns as well as optional initial
upper case. The optional initial upper-casing doesn't work for derived proper
nouns when using Hfst because of an unimplemented feature in hfst-xfst. It is
reported to the hfst team. Increased gtd core version number due to new scripts
and possible dependencies in the gtd core.

r94834:
Extracting flag diacritics, to build regexes that can ignore them in certain
cases (like optional initial upper case). Requires new version of the gtd core.
At the same time split tag extraction in two - the first step extracts the whole
sigma set, and from that we can extract tags, flag diacritics, etc. The sigma
set extraction was greatly improved, removing a number of small errors due to
handling of reserved symbols in Hfst and Xfst.

r94398:
Added test summary for all yaml tests for a given fst.

r94272:
With feedback from Brendan I finally got the number of tests passed and failed
printed as part of the YAML testing.

r94267:
Adaption to a new version of the morph-tester.py script by Brendan Molloy. Small
adjustments to the yaml test printouts.

r94063:
Major bug fix to the generate lemma test script. Now it actually checks that the
generated lemmas correspond to the listed ones.

r94027:
Bugfix: no hardcoded language codes.

r94017:
Now also (language pair independent) morphological generators for Apertium are
installed with their correct Apertium file names.

r94002:
Added renaming to Apertium style filenames, changed installation file list to
only include files actually used by Apertium. With this change, everything
should be in place for a fully automatic integration between the GT-Divvun
infrastructure and the Apertium infrastructure through the use of pkg-config
files, with one exception: morphological generators.

r93938:
A rewritten pc file, with proper paths actually reflecting where things are
installed, and with a shortened description to better fit the use of it.

r93933:
We also need to install the pkg-config file...

r93927:
After a long discussion, the moniker 'giella' was chosen instead of gtdivvun.
Changed datadir from $(datadir)/gtdivvun/* to $(datadir)/giella/*. Added a
pkg-config file so that all installed resources can be found automatically.

r93880:
Changed datadir from $(datadir)/hfst/* to $(datadir)/gtdivvun/*, as it is the
directory used to install the gtdivvun products, and not only hfst transducers
are installed.

r93810:
Require Automake 1.11.6 to avoid errors caused by older Automake's.

r93210:
Make semantic tags optional also for dict and oahpa generators. Added support
for hfst fst's for dict and oahpa.

r93205:
Make semantic tags optional for all generators. Fixes bug
http://giellatekno.uit.no/bugzilla/show_bug.cgi?id=1854.

r93153:
Uncommented the cg3-with-apertium-tags targets, increased the gtcore version
number.

r92913:
Started work on adding hyphenators. No substantial changes, just Automake
conditionals.

r92866:
Actually made the options --disable-analysers and --disable-generators do what
they should, earlier they had no effect. Also renamed those options. Wrapped the
filter targets in mt/tools/apertium/filters/ in apertium conditionals, so that
they will only be built if the apertium option is enabled. Added separate
configure.ac option to disable the transcriptors (the num2text family).

r92827:
Make sure all tests are within conditionals - only run them if the fst's have
been built.

r92650:
Added conversion of analysis tags from GTDivvun format to Apertium format for
the vislcg3 files. The generated vislcg3 files are not valid, and the targets
are thus commented out for now.

r92404:
Added support for tmp files in the apertium target language specific analysers,
to allow local processing of those analysers.

r92369:
Rewrote tag reordering of semantic tags to use a dynamically generated regex,
and split tag reordering in three: reordering sub-POS tags, semantic tags, and
language specific tags. The two first reordering operations are done on all
languages. The reordering is done when building the raw file, to build a fixed
tag order that other fst operations can rely on. The raw file build had to be
split in two steps because of this.

r92339:
Added support for target-language specific filtering for the Apertium analysers.

r92295:
A major update to the Apertium fst building:
* corrected broken logic when building the list of tags used by a language
* build filter to remove derivation strings dynamically from the list of tags
* added a new taglist2remove...strings-regex.sh file to the core
* added a new dir filters/ within tools/mt/apertium/ for building apertium
  specific filters
* added facility to modify locally remove...strings.regex files by using an
  exception file
* build the remove-derivation-strings.regex dynamically also for regular fst's

r92170:
Now building the dialect tag removal filter dynamically, in the same way
as done for the semantic tags. Requires a new version of the GTD core.

r92144:
Dialect tags are now removed in the Apertium fst compilation. In addition, tags
can now be custom changed and reordered on a language pair basis, see README.txt
in tools/mt/apertium/tagsets/.

r92118:
Corrected several errors in the MT Apertium fst builds: now removing semantic
tags and tags for originating language. Silent hfst-invert.

r92100:
Modified the gttags.txt target to produce output also in cases where no GTD tags
are defined. Earlier the build would break in this case.

r92029:
Commented out another debug echo statement.

r91993:
Commented out a debug echo statement.

r91943:
Fixed a bug with optional semantic tags: we built the regex, but not the fst's.

r91821:
Corrected a bug in the lexc yaml testing. Fixed file refs in the dict fst tests.

r91682:
Moved yaml test scripts for different transducer types up one level, to
correspond to the parallel location of the fst files in the build tree.

r91672:
Generalised the yaml test runner code, to identify the relative paths of the
test scripts and the fst's being tested, so that all sorts of fst's can be
tested irrespective of where they are built. Added yaml testing for MT/Apertium.

r91623:
Moved some back-end scripts for yaml testing to the uppermost test directory, to
ease sharing of the same code across test subdirectories.

r91605:
Moved all silencing code to a separate include file (except in a few cases of
double includes). Made the yaml testing a bit more verbose when rerunning
individual tests (copy-paste testing).

r91581:
Changed target language specific analysers to be based off of
analyser-mt-gt-desc.hfst, instead of the *.tmp.hfst file, to allow local post
processing to be applied in the step from *.tmp.hfst to the *.hfst file.

r91564:
Forgot to remove all the targets and build instructions in the old location.

r91559:
Finalised moving the Apertium MT build code to the new location. All parts have
been generalised, and the set of target languages to go with a specific source
language (when analysing) is specified in configure.ac. That is, just list your
target languages in configure.ac, and off you go. One feature still missing:
target language derivation (and other) string filtering for the source language
analyser. Coming soon.

r91448:
Reorganised MT fst building, moving it to a new dir in tools/. This is done to
avoid too much stuff in one dir (src/), and to make it easier to extend the MT
support without making the build files too large for one dir.

r91222:
Added a tmp-file step for the raw fst, to allow local/language specific
overrides when building the raw transducer. Required for Estonian.

r91100:
Added support for dialectal fst's in Oahpa. The dialect tags need only be
specified in $GTLANG/configure.ac, and all filters and fst's will be constructed
automatically.

r91028:
Made generated regex files build and be retained, as well as deleted when using
'make clean'.

r91021:
Added missing test runners, and at the same time made the test XFAILS (i.e.
expected fails due to immature code).

r91015:
Greatly improved support for the dictionary fst building:
* filtering semantic tags now works properly (removed for all but Prop)
* properly silent when using silent builds
* added dict-specific yaml test files and corresponding test runners
* removed building of redundant dictionary fst's since we now can test
  generation and analysis independently - we only test the fst's we
  are actually going to use, in the intended "direction"; this should
  noticeably speed up compilation, especially when using hfst
* all languages now build a dictionary analyser with a mobile phone
  spell relax
Also removed semantic tags from the regular analyser, as discussed earlier.
Increased the required GTD core version, as the new version is required to
fix the bug mentioned above.

r90845:
Corrected the test that triggers a FAIL in the twolc negative test script.

r90799:
Temporarily add the new twolc pair tests to XFAIL-TESTS, to make them pass as
expected fails. This will let all tests run, but will have to be reverted either
when all broken twolc tests are fixed, or when the full test suite can be run
without stopping make.

r90726:
Rewrote the test for awk to check for a feature found only in GNU awk, and use
the one found. Will check both awk and gawk, and use whichever supports the
feature (gawk on some Linux systems is renamed awk). This should make the
build configuration more robust.

r90692:
Added twolc pair string testing for Xerox. Hfst requires another type of pair
strings, and can't easily be tested at present.

r90613:
Added more tailored silent output, silenced Xerox tools as much as possible (not
very much).

r90588:
Made documentation build process work properly when using VPATH builds, and at
the same time silenced the doc build by default.

r90437:
Replaced grep in Makefile with a shell script for extracting semantic tags. This
is done to catch the case where there are no semantic tags to extract. Earlier
this caused a failed build, now it is handled properly. Required GTD core
version increased because the new script is only found in the latest version
of the core.
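
The failure mode can be sketched in shell: grep exits with status 1 when nothing
matches, and make treats that as a failed recipe. The function name
extract_semantic_tags and the '+Sem/' tag prefix below are assumptions made for
illustration only; the actual script in the GTD core may well work differently.

```shell
#!/bin/sh
# Hypothetical sketch: print all lines containing a semantic tag, but
# succeed even when the input contains none.
extract_semantic_tags() {
    status=0
    # grep returns 1 for "no match" and 2 or higher for real errors.
    grep '+Sem/' "$1" || status=$?
    if [ "$status" -gt 1 ]; then
        return "$status"    # a genuine error should still stop the build
    fi
    return 0                # "no semantic tags" is a valid, empty result
}
```

Called from a make recipe, a wrapper like this yields an empty output file
instead of an aborted build when a language defines no semantic tags.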

r90421:
Build remove-semantic-tags.regex for all languages, since we now use it when
building speller files.

r90419:
Only test spellers if we build spellers.

r90415:
Moved the new test script processing in configure.ac up a few lines to avoid
conflicts during template merge.

r90414:
Added a test to check zhfst file validity. Not functional yet, because of a bug
in hfst-ospell.

r90368:
Fixed bug http://giellatekno.uit.no/bugzilla/show_bug.cgi?id=1830. The word
border mark removal had earlier been moved from a preceding step to the final
product compilation step. It was added to the foma speller compilation, but not
included in the hfst speller compilation for some reason. Now it is.
At the same time removed semantic tags from the speller transducers, to make the
analysis/generation string more readable when debugging - they are not used by
the speller builds.

r89769:
Fixed a build bug: it tried to build oxt files also when hfst support was not
enabled. Now oxt files will only be built if hfst is on, and spellers have been
requested.

r89623:
Fixed a problem with building zhfst files using VPATH builds.

r89587:
Added convenience upload target to upload the oxt files and make a permanent
link to the latest version.

r89570:
Fixed pattern rule error.

r89569:
Simplified the building of the hfst lexical transducer, and made it easy to use
the -F option to hyperminimise the lexical fst. Using pattern rules instead of
fixed filenames.

r89561:
Added first working build of oxt files. Not yet generalised, but working with
the paths we have. Will build Windows and Mac OXT files for all languages.

r89338:
Whitespace change.

r89333:
Stop with error if --with-hfst was requested but could not be turned on.
Based on patch by Unhammer.

r88686:
Really fixed syntax errors, and another old error.

r88684:
Fixed syntax errors.

r88682:
More comments and clear separation between the different macro sections. Added
first components for configuring oxt building - checking whether it is possible
to sync the oxt template locally from $GTHOME. Added check that all components
of the Xerox tools are installed before enabling Xerox builds.

r88677:
Added a variable to hold the LibreOffice version number where speller support
for a language was initially available.

r88669:
Made minimisation of speller automata configurable (default=no), since it can
be extremely time and resource consuming for some languages, and the size
difference for the final fst is not very big; we might lose some speed though,
which needs to be tested. Reorganised the configure.ac code by moving most code
to the m4/giellatekno.m4 macro file. What remains in configure.ac is pretty
clean and mostly easily understood.

r88620:
Renamed the last files in am-shared/ to follow the correct naming scheme.

r88615:
Removed test-src-morph-include.am - it was in reality empty.

r88596:
Moved word border removal to each fst-based speller, as we need it in the hfst
speller production (for word-based weighting) but not in the foma-based speller.
Rewrote all (fst-)speller build steps to regexes instead of hfst pipelines.

r88584:
Rewrote the mt build instructions to a regex instead of an hfst pipe.

r88578:
Renamed doc-include.am to follow the naming scheme.

r88569:
Renamed hunspell-include and listbased-spellchecker-include.

r88562:
Deleted unused include file.

r88558:
Renamed orthography-include.am and hyphenation-include.am to follow the naming
scheme. Now all src-dir includes are renamed.

r88545:
Renamed phonetics-include.am to follow the naming scheme.

r88543:
Renamed syntax-include.am to follow the naming scheme.

r88539:
Renamed transcriptions-include.am to follow the naming scheme.

r88533:
Renamed phonology-include.am to follow the naming scheme.

r88531:
Renamed disamb-include.am to follow the naming scheme.

r88526:
Removed 130 lines of code, and made the code much more readable by replacing the
long pipe of hfst commands with one regex per target. The actual regex is a
mirror copy of the xfst regex already in the file, which means that it is also
very easy to maintain functional parity between the two architectures.

r88522:
Renamed the lexc include file to follow the correct naming convention.

r88520:
Renamed the main src include file to follow the correct naming convention.

r88405:
Completed documentation for updating gtcore.

r88390:
Added support for building disamb-oriented fst's, which include the semantic
tags.

r87898:
Corrected VPATH build of lexc files generated from xml. At the same time
silenced the XSL processing a bit, and corrected a minor configure error for
VPATH configurations.

r87890:
It was not a good idea to redefine a variable referencing itself - AM stops.

r87889:
Enabled compression of zhfst files again, it should now work across all
platforms. Made filter regex compilation quieter for xfst, and made the silent
mode more informative for filter compilation.

r87298:
Make the xfscript compilers quiet in silent mode, verbose in verbose mode.

r87220:
When running LexC tests, if no tests were found, the test bench will now report
that the whole test was skipped. Earlier it reported a pass.
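
A minimal sketch of how such a skip can be reported, assuming the test runner is
a shell script driven by the Automake test harness (which treats exit status 77
as SKIP). The function name run_lexc_tests and the *.yaml file pattern are
hypothetical and not taken from the actual test bench.

```shell
#!/bin/sh
# Hypothetical test runner fragment: report SKIP when no test files exist.
run_lexc_tests() {
    testdir=$1
    found=0
    for f in "$testdir"/*.yaml; do
        # An unmatched glob stays literal, so check that the file exists.
        [ -f "$f" ] || continue
        found=$((found + 1))
        # ... a real runner would invoke the test script on "$f" here ...
    done
    if [ "$found" -eq 0 ]; then
        echo "SKIP: no tests found in $testdir" >&2
        return 77    # the Automake test harness reads status 77 as SKIP
    fi
    return 0
}
```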

r87212:
Tailored silent build output for Vislcg3.

r87206:
Increased the actual and required version number after a small bugfix in
the speller version easter egg, to ensure all generated spellers have proper
version info.

r87099:
Corrected a fatal bug for non-latin spell checkers: the error model contained
one letter from the easter egg not found in the acceptor. This symbol mismatch
is fatal for hfst-ospell, and caused all non-latin spellers to crash (the latin
spellers would all have this symbol ('p') anyway, so no problem was noticed
earlier).

r86991:
Corrected the compilation of xfscript files such that we still have a general
build rule for xfscript files, but now with a following inversion when needed.
Also added better feedback on the build steps in silent mode.

r86874:
Updated required and actual version number of gtdcore. The easter egg creation
for hfst spellers depends on new files in the core, and also the abbr.txt
building does so. Without an updated core e.g. speller builds will fail.

r86855:
Renamed more am-shared files.

r86822:
Renamed topdir-include.am to src-include.am to follow the correct naming
pattern.

r86820:
Experimenting with feedback on silent builds (make V=0). Looks good.

r86806:
Added Autotools support for building the abbr.txt file. This file is _not_
included in the regular make commands, one has to cd into the tools/preprocess/
directory, and to 'make abbr' there. This is on purpose.

r86774:
Added a new dir tools/preprocess/ to hold resources for the preprocess utility.

r86740:
Added automatic switch between hfst-foma and hfst-xfst for compiling xfscript
files into transducers. hfst-foma is the default, with fallback to hfst-xfst if
hfst-foma is not found. There are still issues with hfst-xfst.

r86734:
Moved xfscript compilation out of phonetics-include and hyphenation include.
These am-files contained an invert command that combined with an invert command
in the actual xfscripts created a meaningless double inversion.

r86719:
Removed unused twolc.am file. Reorganised the code for twolc and xfscript
compilation, to avoid duplicate code and prepare for improvements. Added M4
macro check that either hfst-xfst or hfst-foma is included; hfst compilation is
turned off if neither of them is.

r86655:
Reduced weights for the easter egg suggestions, to keep other suggestions from
coming in between.

r86582:
Easter egg with version info now working in the hfst speller.

r86438:
Added initial version file for the hfst-based spellers.

r86359:
Explicit support for local source files and targets for the syntax.

r86355:
Added support for building (compiling into binary form) cg3 files for syntactic
functions and dependency graphs. Added a template file for syntactic functions.
Made the compiled binary files installable through 'make install'.

r86122:
Added version checking of vislcg3, renamed a couple of variables, and improved
configuration feedback a bit. Now we require a vislcg3 new enough to not
complain about recent addition of new features.

r85561:
Changed the file order when building zhfst files - there are still issues caused
by the index.xml file being non-first. Now it is always first.

r85550:
Finally fixed the libvoikko/zhfst spellers. Ready for Windows!

r85468:
Moved the common src/filters/ inside a common/ dir, to allow for other parallel
dirs like smi/ and und-Cyrl/ that target only a subset of the languages. At the
same time renamed gtshared/ to gtdshared/. This change requires version 0.2.0 of
the gtdcore.

r85453:
Added the requirement to remove orig_lang-tags (OLang/NOB etc) by adding
filters/remove-orig_lang-tags.xfst also to generator fsts, not only to analyser,
for the dicts fst-s.

r85430:
Fixed a bug that hindered the GTD core from finding the version info script in
the core (as opposed to installed).

r85421:
Added version checking of the GTD core: if the core is too old, configure will
stop and print an error message with instructions on how to proceed. Added an
external Autoconf M4 macro for version comparison, and renamed the file of an
existing module, to be more consistent and explicit in the filenames. This work
is done in preparations for other changes in the GTD core, which will require
the core to be updated to not render all languages broken.
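
The kind of check involved can be sketched in plain shell. This is not the
Autoconf macro that was added; version_ge is a hypothetical helper, and it
assumes version strings with up to three dot-separated numeric fields.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is at least version $2.
version_ge() {
    # Sort both versions numerically field by field; if the required
    # version ($2) sorts first, the found version ($1) is new enough.
    lowest=$(printf '%s\n%s\n' "$1" "$2" \
        | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
    [ "$lowest" = "$2" ]
}

# Example use: stop with instructions if the core is too old.
# The version numbers here are made up for the example.
if ! version_ge "0.2.0" "0.1.5"; then
    echo "GTD core too old, please update it and rerun configure" >&2
    exit 1
fi
```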

r85356:
Removed the filter "remove-NG-string.regex" from the
analyser-dict-gt-norm.xfst target, in order to allow Use/NG entries in dict
fsts.

r85211:
No PCDATA text element should be on a line of its own; that seems to trip up
TinyXML2.

r85193:
Another whitespace change to make TinyXML2 happy.

r85177:
Removed a space that tripped up TinyXML2. Tiny typo correction.

r85044:
Added some default content to the description element, to keep hfst-ospell from
segfaulting.

r84651:
And with some more coffee in my system, remove-semantic-tags-except-prop.xfst
is now included in the mobile dict analyser.

r84650:
Checked in the two dict analysers with different spellrelax, but forgot 
semantic tags and orig_lang tags.

r84648:
Two dict analysers, one with mobile spellrelax, and one without. Also removing
certain semantic tags and orig_lang tags which prevent POS from being the first
tag, and mess with lookups for NDS.

r84363:
Adding possibility to first look for specific regex creation shell script before
falling back to a default shell script. This will allow us to create more
complex or tailored regexes for certain tag sets (like the semantic tags), while
having a reasonable fallback for other cases.
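
The lookup order can be sketched as follows. The function name
pick_regex_script and both script file names are hypothetical, chosen only to
illustrate the specific-then-default fallback; the real build rules may locate
the scripts differently.

```shell
#!/bin/sh
# Hypothetical sketch: prefer a tag-set specific script, else the default.
pick_regex_script() {
    scriptdir=$1
    tagset=$2
    specific="$scriptdir/make-$tagset-regex.sh"     # assumed naming scheme
    default="$scriptdir/make-generic-regex.sh"      # assumed default script
    if [ -f "$specific" ]; then
        echo "$specific"
    else
        echo "$default"
    fi
}
```

With only the generic script present, every tag set gets the default regex;
dropping in e.g. a make-semantic-regex.sh overrides it for that tag set alone.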

r84350:
Keeping intermediate files didn't work, created an error. Now it works.

r84334:
Fixed a make warning, made generated regex files survive the build.

r84212:
Further cleanup of semantic tag filtering: no processing of semantic filters in
the shared makefiles.

r84117:
Remove semantic tag filtering from the common targets, it is only used by sme
and sma.

r84090:
Added rules to generate regexes automatically from the list of extracted tags.
First out is the regex to make semantic tags optional, and another to remove
them completely. Also fixed file references in the relabel targets.

r84086:
Added a rule to generate phonology documentation from xfscript, not 
only from twolc.

r83997:
Only build one file of tags, using hfst or xfst depending on the configuration.
Extract semantic tags.

r83976:
Reverted a change to hfst lexc compilation - the -f option doesn't work.

r83961:
Moved tag extraction from tagsets to filters, as it has a more general use as
the basis for dynamic filter construction. Tag extraction now works with both
Xerox and Hfst.

r83906:
Xerox will now stop on lexc syntax errors. Hfst will not until (hfst_)foma is
fixed, because foma doesn't stop on syntax errors. But one is better than none.

r83807:
Removed one harmless but irritating warning.

r83736:
Commented out weighting of the acceptor fst - it causes a segfault in
hfst-ospell.

r83655:
Added a filter to remove dynamic derivation.

r83588:
YES! Finally got weighted automata working in the speller. Added missing hfst
tools, and sorted all the hfst tools alphabetically. Updated the required hfst
to version 3.5.1.

r82738:
Changed build files to support Hfst 3.5, requires 3.5.

r82633:
Added LexSub string filter.

r82452:
Changed voikko compression back to zip - gzip isn't voikko compatible.

r82434:
FINALLY fixed the automake 1.11 vs 1.13 test incompatibilities. Now we can allow
version 1.11, and still get the pretty output we want in newer automakes.

r82406:
Fixed references to GTCORE in test scripts. Earlier we relied solely on it
being set in the environment, now we take it from configure (which can take it
from the environment or from a script).

r82403:
One more gzip option fix.

r82399:
Fixed argument structure of gzip - zipping was broken for hfst and gramcheck.

r82316:
Consistently use gzip instead of zip, and find gzip outside any conditionals.

r82308:
Redirected command feedback of the analyser shell script to stderr, to avoid
cluttering the analysed text in pipe use.
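
A minimal sketch of the pattern, with a hypothetical analyse function in which
cat stands in for the real lookup command: progress messages go to stderr, so
only the analysed text travels down the pipe.

```shell
#!/bin/sh
# Hypothetical stand-in for the analyser script.
analyse() {
    echo "Analysing with $1 ..." >&2    # feedback goes to stderr
    cat                                 # placeholder for the lookup command
}

# Only "word" reaches wc; the feedback line is still shown on the terminal.
echo "word" | analyse analyser.fst | wc -l
```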

r82266:
Restored the Makefile and the shell script, now that the dir is merged.

r82261:
Had to remove the Makefile as well, adding only the dir in the first go - no
text replacement was done inside the Makefile.

r82258:
Removed the shell script, to take the merge in two steps: first create the dir,
then add the shell script file. This makes it possible to rename the file at the
same time, whereas if we merge the dir and the file in one go, no renaming will
take place. That leaves us with a tedious manual rename process afterwards.

r82255:
The first lookup shell script added, with supporting infrastructure.

r82231:
Added option to automatically create a language home dir environment variable.
The idea is that by setting this variable, we can reliably find transducers in
the working copy dirs of the users. The default is to not do anything (but give
a warning).

r82207:
Changed back the Automake requirement to 1.11 - 1.12 is creating too much
trouble. We'll have to see what to do with the test output - the version
requirements change must be followed by another change that will substantially
degrade test reports on newer automakes.

r82203:
Made the check for GTCORE functional, looking for both the gt-core.sh script
(and using its output if found), and the environment variable $GTCORE. This
means that there is no need anymore to set the GTCORE variable as long as one
runs configure, make and make install in the gtcore directory.
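
The lookup order can be sketched in shell as below. The real check lives in the
configure macros; find_gtcore is a hypothetical name, and the assumption that
gt-core.sh prints the core directory on stdout is inferred from the description
above, not verified.

```shell
#!/bin/sh
# Hypothetical sketch: locate the gtcore directory.
find_gtcore() {
    if command -v gt-core.sh >/dev/null 2>&1; then
        gt-core.sh              # assumed to print the installed core dir
    elif [ -n "${GTCORE:-}" ]; then
        echo "$GTCORE"          # fall back to the environment variable
    else
        echo "gtcore not found: install it or set GTCORE" >&2
        return 1
    fi
}
```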

r82063:
Corrected bug/feedback e-mail address to one actually working.

r82028:
Made LexC compilation break on error, at least for Xerox (Hfst only gives a
warning for the same error).

r82022:
Moved the remove-illegal-derivation-strings.regex from all langs to only the
three Sámi langs actually using it. Even though potentially useful for more
languages, it can hardly be considered a language universal...

r81906:
More build rules for the grammar checker. Now it will install.

r81862:
Corrected the --enable-grammarchecker option testing.

r81857:
Changed the order of the configure macros, to allow for testing for program
availability when checking the enable options.

r81854:
Forgot to add the new Makefile to configure.ac.

r81831:
Added basic build infrastructure for a CG-based grammar checker. No template
source file added yet, as this is still pretty experimental.

r81653:
Updated the filenames to match what we actually check out.

r81633:
Copy-paste error introduced scanning of a subdir test that doesn't exist for any
language but SME. Now corrected.

r81625:
Reorganised the phonetic build code to better support parallel phonetic
transcription depending on the source language of loan words and foreign names.

r81597:
Added check for the availability of 'see' when testing, to avoid bad fails on
systems without 'see'.

r81592:
Added config feedback about vislcg3/syntactic parsing status. Added config check
for the see tool (SubEthaEdit).

r81588:
Remove copying of the timestamp file for non-maintainers. It breaks the
automatic merge, and requires a revision-explicit merge for each such language.
Also added removal of originating language tags - they are only used in TTS.

r81579:
Added compilation of the remove-orig_lang-tags filter. Sorted the filter
targets.

r81562:
Improved and corrected configure feedback for spellers.

r81556:
Corrected syntax error in a test. Improved config feedback further.

r81551:
Now all speller fst's are turned off by default (I missed a few in the previous
commit). The configure feedback is slightly improved.

r81544:
Changed the default setup to only include morphological analysis and generation.
This is done to reduce the build time during regular development. This means
that to build spellers and other specialised fst's, they must now be enabled
using ./configure. Cf. bugzilla #1710.

r81095:
Corrected filter order for the text2X transcriptors.

r81078:
Completely redid the text2num etc transducers. The previous solution was in the
wrong place, and didn't incorporate the actual filtering. Now it does, but
whether this is the way it should be needs to be tested.

r81065:
Another Xerox error correction - we're using LexC, not Xfst. Skipped the result
stack - not needed.

r81054:
Corrected Xerox error.

r81052:
Forgot one small make step.

r81051:
Added the inverse transcriptors, to go from text to numerical expressions.

r81022:
Wrapped phonetic / IPA conversion in a configure option, default is 'no'. Now
compiling SME with Xerox should be back to normal speed again.

r80337:
Added Remove ACR filter.

r80322:
Added compilation of the filters for the orthographic tags, and added removal of
them and the IPA strings in all regular fst's.

r80164:
Added missing hfst tool hfst-fst2strings to the M4 autoconf macros.

r79977:
Forgot to rename a variable after copy-paste.

r79929:
Reorganised the build code for dictionaries, added a dictionary option for
configure (disabled by default), and added the new filter for mobile keyboard
spellrelax.

r79408:
Still one more case of optimised lookup format removed. The underlying problem
remains, though: that the hfst tools can't take all produced formats as input.
Also added gzip compression of the att file transferred to Apertium.

r79402:
Another bugfix: switched from -f olw to -t in another case, to avoid hfst crash.

r79373:
Bugfix: hfst-substitute can't take lookup-optimised fst's as input.

r79338:
Removed a sma-specific filter that had crept in. Added att output fst to the
default apertium analyser target.

r79327:
Added missing check and variable definition for hfst-fst2txt. Several minor
changes to the apertium build instructions.

r79316:
Added missing reference to the remove-semantic-tags-except-prop filter.

r79315:
One small change forgotten in the previous commit: comments and one less target
in the default setup (more to be added on a per-language base).

r79308:
Moved MT/Apertium code to the und template from sma. Not tested, most likely
buggy.

r79152:
Added remove-variant-string.regex, for removing strings containing +v2,
+v3, +v4, +v5, but not removing +v1.
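
A minimal sketch of the intended filtering effect, written in Python rather than
as the actual xfst regex. It assumes the variant tags appear literally as +v1 to
+v5 inside an analysis string; the function name and the sample strings are
hypothetical and only illustrate which strings survive.

```python
import re

# Assumption: non-primary variants are marked with the literal tags +v2 .. +v5.
# The negative lookahead keeps a hypothetical tag such as +v20 from matching.
NON_PRIMARY_VARIANT = re.compile(r"\+v[2-5](?!\d)")


def keep_primary_variants(analyses):
    """Return the analyses that carry no +v2 .. +v5 tag (+v1 is kept)."""
    return [a for a in analyses if not NON_PRIMARY_VARIANT.search(a)]
```

The real filter is composed onto the transducer at build time; this list-based
version is only meant to show the selection criterion.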

r79081:
Change echo to printf for cross-platform compatibility.

r79057:
Improved error handling in testing shell scripts.

r78956:
Removed -s option from hfst-summarise in tagsets/ and added # -> + to the
Apertium relabel script.

r78059:
Renamed refs to template dir in preparation for support for multiple template
dirs.

r77898:
Commented out examples of error models for string and word pairs - they would in
most cases add symbols to the error model not found in the acceptor, and this
combination would crash the speller badly.

r77567:
Cleaned up speller fst building, removing all unnecessary inverts and
streamlining the code. Prepared for the introduction of weights, but commented
out for now because of bugs or inefficiencies in openfst. Renamed the included
hfst speller build file, to follow an emerging naming standard for the include
files.

r77523:
Added support for making variant analysers and generators using the Apertium tag
conventions. The generated transducers are still not fully Apertium-compatible
but they are a major step forward.

r77475:
Renamed analyser-raw-gt-desc.hfst to generator-raw-gt-desc.hfst, to make the
behavior in hfst-lookup explicit and clear. Still, the "generator" behaves as
the Xerox "analyser" in hfst when it comes to composition and filtering.
Confusing, I know.

r77459:
Build the filter to remove CLB strings from speller transducers, and use it.

r77449:
Added missing hfst tools. Removed commented-out code in the index.xml file.

r77368:
Removed the ocr error model from the zhfst building, it causes libvoikko 3.4 to
segfault.

r77364:
Added an explicit copy operation into the hfst speller dir, to facilitate local
modifications of the speller transducer before further processing, by just
replacing the copy operation with whatever is needed.

r77356:
Added string pairs and whole-word corrections to the speller error model.
Added support for an ocr error model. Removed obsolete Voikko config file.
Corrected bugs in the hfst M4 macros.

r77317:
Moved the initial spell checker processing to the top spellchecker dir, to serve
as the default starting point for all spell checkers.

r77273:
Added a tagset directory in preparation for generating Apertium transducers
automatically. Corrected and expanded a few M4 macros for the hfst tools.

r76046:
Added support for testing analysers and generators only. For several of our more
specialised transducers, this is more practical and useful than always
generating both pairs of transducers to test both directions.

r75902:
Corrected the existing oahpa transducer. Added dummy hfst oahpa target.

r75594:
Corrected a bug in the hyphenator hfst build: fst's must be inverted in hfst.

r75459:
Corrected another copy-paste error that broke speller fst's.

r75424:
Corrected copy-paste error.

r75423:
Rewrote a number of targets to reflect a split morph boundary removal filter.
There are now three filters instead of one, to allow for more flexible fst
building for speech processing.

r75275:
Added gzip compression of foma speller transducer, and proper checks for
prerequisites. Foma spellers can now be disabled, they are enabled by default.

r75256:
Corrected a bug when building foma-based spellers. Changed one fst filename to
follow the naming scheme for the new infra. Improved building of the zhfst
speller file.

r74922:
Added processing of new filters.

r74604:
Do not try to build hfst-based tools if hfst building is not enabled.

r74427:
Forgot to include changes to the filters Makefile.am in the previous commit.

r74424:
Moved some of the fst-speller building one level up, and added support for
building foma-based spellers.

r74285:
Renamed phonetics source and target files to reflect the actual purpose.

r74259:
Added possibility to build morph segmenter for those langs that have morph
boundaries marked in lexicons.

r74234:
Added a top-level misc/ dir to hold private / non-svn files needed during
development of the language. All files are ignored.

r74072:
Corrected hfst 2ipa fst: the final fst needs to be inverted before being used in
lookup.

r74012:
Corrected the homonymy and variant filters used for generators - those tags
should be optional, not completely removed.

r73986:
We require gawk specifically, not any awk whatsoever. Improved config feedback.

r73341:
Corrected reference to the built fst's.

r73159:
Updated the zhfst building to reflect recent changes in Voikko. There is now
official support for zhfst speller files, but with a new location and no *.pro
file. Also added simple support for local loading of the zhfst file -
voikkospell requires that the file is located within a dir named '3'.

r72875:
Further improvements to the test run output.

r72864:
More tweaks to make the test output compact and readable.

r72854:
Made use of the more compact modes of morph-tester.py. For all PASSed test runs,
only one line is printed.

r72836:
Moved Oahpa transducer compilation to a separate (included) file, and added
support for compiling dictionary transducers.

r72827:
We need the last part of the path to properly identify the lexc file tested.

r72823:
Made the morph-tester test runner (LexC and YAML tests) less verbose. All
messages are one-liners, except for FAILs.

r72769:
More thorough clean in src/morphology/.

r72662:
Moved the definitions of the transducer variables to the Makefile.am, to make it
possible to extend them by local modifications.

r72560:
Forgot to update the src/filter/Makefile.am file.

r72534:
Split the filter 'remove-dictionary-tags' in two to remove homonymy and variant
tags separately.

r72518:
Added filter to remove NGminip strings, ie paths that should not be used for
generating miniparadigms in dictionaries.

r72423:
Added infrastructure for building fst's for list-based spellers. The actual
building is not yet implemented.

r71921:
Remove doc build dir when cleaning.

r71912:
Forgot to update the config file.

r71905:
Reorganised the tools/ dir to fit better with coming development.

r71576:
Several adjustments to the forrest setup for jspwiki validation.

r71421:
Added support files to enable forrest validation of jspwiki files. Second part
of making sure extracted documentation comments won't break site building. Also
added another make step to actually run forrest at the end of building the
documentation. Make will now break if there are fatal errors in the jspwiki
markup.

r71326:
Corrected cut&paste error.

r71325:
Added check for forrest as part of configuring the documentation extraction.
Forrest will be used to validate the jspwiki documents during the build, to
prevent invalid documents from entering the svn repository and corrupting the
web page building.

r71071:
Upped the required automake version from 1.11 to 1.12, to avoid all hassles with
the test harnesses and backwards compatibility.

r71052:
And then the rest of the tests changed into the most portable format.

r71025:
Even more portable testing...

r71002:
Improved portability & correctness of conditional tests in the morphology
testing.

r70961:
Major update to the LexC testing. Now test data directly in the LexC code is
supported by the python test script morph-tester.py (it reads the lexc files
directly), which solves the bugs with multiple wordforms for the same
morphosyntactic inflection. It is also a bit faster than the awk solution.

r70633:
Added initial documentation extraction of CG3 files. Probably more work to be
done to get things working as intended.

r68844:
Finally found out how to get the old test behaviour back. We want the serial
tests, because they give direct feedback to the linguists. Automake 1.13 uses
parallel testing by default, which logs all test results to files.

r68823:
Added support for processing twolc files for documentation extraction.

r68816:
Some files may contain digits in their filename. Extended the filename match
pattern for the Links target.

r68806:
Added support for automatically building a file with links to each individual
generated jspwiki file.

r68760:
Forgot to add the jspwiki preamble file. Now added.

r68649:
Added some very basic documentation comments to the template root.lexc.

r68639:
Forgot to add support for the conditional CAN_DOCC in the previous commit.

r68630:
Added initial support for extracting documentation from comments in the source
code. Only jspwiki supported initially. Also added initial support for
extracting test data from source code comments. Only yaml tests in lexc are
supported initially.

r67871:
The final fix to get the XML-to-LexC conversion working on Cygwin.

r67860:
Concatenate all LexC source files into one file explicitly, instead of letting
hfst-lexc do it. This is more robust cross-platform, and makes the file used for
transducer compilation easily available for debugging.
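
The explicit concatenation step can be sketched as follows. This is an
illustration in Python, not the actual make rule; the function name is
hypothetical, and the handling of a missing final newline is an assumption
about what a robust concatenation should do.

```python
from pathlib import Path


def concatenate_lexc(sources, target):
    """Write all source files, in the given order, into one target file."""
    with open(target, "w", encoding="utf-8") as out:
        for source in sources:
            text = Path(source).read_text(encoding="utf-8")
            out.write(text)
            # Assumption: make sure one file's last line never fuses with the
            # first line of the next file.
            if text and not text.endswith("\n"):
                out.write("\n")
```

The order of the sources matters, since the file holding the root lexicon and
the multichar symbol definitions has to come first.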

r67840:
Corrected the host detection test for Cygwin.

r67828:
Made spell-relax a language-specific file by adding it to the und template.

r67827:
Added support for XSL conversion of XML source files on Cygwin.

r67732:
Made Voikko support optional instead of required.

r67724:
Fixed a stupid bash syntax error in the previous commit.

r67720:
Rewrote LexC and TwolC Xerox rules to make them work on Cygwin: the Windows
Xerox tools need a script file as input, the scripts can't be piped in as on
*nix systems. Removed the hack in the previous commit. The bug can be worked
around by avoiding linebreaks in the piped script.

r67563:
Added hack to work around a very strange bug in LexC transducer saving - the
filename is slightly garbled if the save command is passed in from a script
generated by a make file (but the same command passed in from a manually typed
script works correctly).

r67353:
More robust Saxon/Java setup: no need to define CLASSPATH. The M4 macros will
look for a couple of predefined pathnames, and pick the first saxon9he.jar file
it finds. More locations should be added as needed.
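
The lookup logic amounts to a first-match search, sketched here in Python
instead of M4. The directories in CANDIDATE_DIRS are hypothetical placeholders;
the real list of predefined pathnames lives in the M4 macros.

```python
from pathlib import Path

# Hypothetical candidate locations, for illustration only.
CANDIDATE_DIRS = ["/usr/share/java", "/usr/local/share/java", "/opt/local/share/java"]


def find_saxon_jar(candidate_dirs=CANDIDATE_DIRS, jar_name="saxon9he.jar"):
    """Return the first existing jar among the candidate dirs, or None."""
    for directory in candidate_dirs:
        jar = Path(directory) / jar_name
        if jar.is_file():
            return jar
    return None
```

Because the first hit wins, the order of the candidate list decides which
installation is used when several are present.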

r67309:
Require at least HFST 3.4 - it includes all backends, and simplifies dependency
handling quite a bit.

r67043:
Fixed parsing of regexes for hfst, due to a bug in hfst-regexp2fst when parsing
regexes with comments after the regex is closed.

r66959:
Refactored the yaml test code, moving duplicate parts to a separate file. Makes
for much easier adaption to new transducer types.

r66833:
First step in making the digit transcriptor transducers work. The transducers
are compiled, and are given proper names according to the fst naming
conventions, and the Xerox transducers work in the digit-2-string direction. The
Hfst transducers do not yet work (segmentation fault due to running out of
memory because of an infinite recursion), and the string-2-digit direction is
not yet in place.

r66815:
Made the yaml test scripts obey configuration options, ie only run the hfst tests
if hfst is turned on at configuration time.

r66329:
Automake requirement reduced to 1.11, after getting confirmation that that
version is fine for finding Python (the main issue that triggered the
version requirement).

r66316:
There's too much trouble with finding the correct Python version when using
Automake v.1.10. We thus require 1.12 from now on.

r66241:
The noun lemma generation test script has been updated to only test the
transducer types that have been turned on at configuration time.

r65468:
Grep out comments from regex files in orthography/, as there is a bug in
hfst-regexp2fst.

r65380:
The actual Oahpa configuration was lost! Now finally included and working.

r65377:
Forgot Oahpa configuration feedback.

r65376:
Several updates:
* slightly improved feedback from the configure script
* improved hfst spell checker building
* added basic support for building Oahpa transducers, *disabled* by default

r65370:
Renamed 'dictionary' to 'spellerautomaton' in giellatekno.m4. The old variable
name and printouts were confusing - 'dictionary' has many meanings, and some very
concrete ones in the context of the GT/Divvun work.

r65289:
Minimise after every compose operation - always.

r65064:
Bug fixes to the Saxon/Java configuration.

r65002:
Call saxon checks from configure.
In previous commit: Ubuntu version of xml2lexc with autostuff.

r64988:
Only check hfst version if requested using '--with-hfst', otherwise disable.
Likewise, disable xfst if requested and print warning if both are disabled.

r64645:
Added simple feedback to autogen.sh, valuable when processing many languages.
Some reformatting of am-shared/hfst-spellchecker-include.am.

r64584:
Removed double inversion from the hfst generator - it didn't work.

r64568:
One more syntax error.

r64567:
Corrected syntax errors introduced in the previous commit.

r64566:
Silenced the build using an Automake macro.

r64560:
Make the silent build rules backwards compatible.

r64552:
Reapplied the simplification of hfst regex expressions, now with the correct
command, thus working. Corrected hfst filter compilation. Took the first steps
in silencing the verbose make output.

r64550:
Reverted the simplification of hfst regex compilation.

r64539:
Added support for spellrelax. Also renamed the orthography include file to be
more generic. Simplified regex build commands now that the bugs in
hfst-regexp2fst have been corrected.

r63846:
hfst-preprocess-for-optimized-lookup-format has been removed from the hfst
distribution.

r63795:
Always fail hfst check if hfst-info can't be found.

r63667:
Add requirement of foma for hfst compilation; remove distinction between
WANT_[HX]FST and CAN_[HX]FST.

r63602:
The local modifications to Makefile.am files must be before the fallback pattern
targets, it seems, otherwise the fallback targets are used.

r63579:
I believe I finally have fixed the yaml testing shell scripts.

r63573:
One more fix for the yaml testing shell scripts.

r63571:
Escaping in the yaml test scripts didn't work - removing the single quotes did.
Also added an underscore in front of the transducer string in the yaml testing,
to avoid that the test scripts get too greedy when we get more transducers and
test data.

r63557:
Small variable correction in the yaml test bench.

r63556:
Forgot to escape the single quotes used within the backtick expression.

r63527:
Corrected fail check in noun lemma generation test.

r63522:
Split the yaml test runner in two, one for norm and one for desc transducers,
and updated the autoconf file correspondingly. Updated the lemma generator test
to work with the renamed transducer. Made all test runners more robust.

r63501:
Renamed all existing targets to follow the naming scheme defined at
http://divvun.no/doc/infra/infraremake/TransducerNamesInTheNewInfra.html. Also
added making of true normative and descriptive analysers and generators, as well
as moved all of the hfst speller building to the
tools/spellcheckers/hfstspeller/ dir. More explicit separation of local and
central code in src/.

r63483:
Added a simple header to the beginning of the compilation, to make it easier to
spot each new language when building all languages in $GTHOME/langs/.

r63469:
With the recent fixes to regexp parsing in hfst-regexp2fst it was possible to
bring the hfst compilation up to par with the Xerox compilation. In principle
the Xerox and Hfst transducers should behave exactly the same - any deviation is
a candidate bug in either the Xerox or the Hfst tools. This update requires hfst
3.3.14 to work properly, the requirement is added to the configure.ac file.

r63149:
Removed references to newinfra/.

r63144:
Added warning about missing YAML testing, with short instructions on how to
enable them.

r62953:
The top-level syntax include AM file had not been changed to reflect the
rle->cg3 suffix change.

r62867:
Corrected a bug in the default generate-noun-lemmas.sh test script. Made file
references more robust.

r62683:
Variables=cleaner code.

r62663:
Updated the yaml test runner to properly report the exit value of the yaml
tests, and also to give directions for how to see the details of each test if it
failed.

r62650:
Corrected typo in shell scripts.

r62648:
Several testing shell script updates: correct exit value when data files are not
found, proper use of Autoconf-made variables (will free the test scripts from
relying on the user setting up environment variables), and better checks on the
availability of test data for the lemma generation test.

r62643:
Added check for the Xerox lookup tool, which also defines the LOOKUP variable.

r62639:
Reorganised AC processing of shell scripts to be more future-proof. Added AC
variable to the AC-processed shell script to make casual by-lookers aware of the
fact that the resulting shell script file is generated by AC.

r62635:
Moved Autoconf processing of the yaml testing shell script to the top of the
list of AC_CONFIG_FILES, to avoid annoying warnings from chmod.

r62621:
Corrected error in previous commit.

r62619:
Forgot to update configure.ac.

r62617:
Refined the yaml test runner: more informative banner, ignore extra analyses
(= removes false alarms).

r62610:
Added basic setup for running YAML tests in the test/src/morphology/ dir. The
default setup will run all *.yaml files found in this dir, but this can be
modified in the shell (*.sh.in) script.

r62596:
Enable yaml tests by magic.

r62590:
Added conditional support for running python-based tests in test/src/morphology.

r62581:
Added checks for Python 3.1+ and py-yaml, and defined CAN_YAML_TEST. The idea is
that we will run the python-based tests only if the prerequisites are available
to us, and skip them if not.
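
The condition behind CAN_YAML_TEST can be sketched in Python itself. The real
check is done by configure; the function below is a hypothetical illustration,
and it assumes that py-yaml is importable under the module name "yaml".

```python
import importlib.util
import sys


def can_yaml_test(min_version=(3, 1), module_name="yaml"):
    """True when Python is new enough and the YAML module can be imported."""
    if sys.version_info[:2] < min_version:
        return False
    # find_spec returns None when no top-level module of that name is installed.
    return importlib.util.find_spec(module_name) is not None
```

When the function returns False, the python-based tests should be skipped
rather than reported as failures.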

r62276:
Added missing entries in configure.ac and src/Makefile.am.

r62275:
Added support for transcribing transducers, ie transducers that change the input
from one orthographical representation to another, e.g. date and time
expressions as strings or digits to the opposite form.

r62270:
Renamed the default error model file, to follow the naming scheme used in the
zhfst guidelines.

r62265:
Don't remove the *.tmp files - that destroys the dependency relationships for
(auto)make, which forces a full recompilation of all target fst's, and a lot of
extra waiting time.

r62192:
Add missing src to hfst spellchecker automaton path.

r62171:
Added missing reference to dialect tag filter.

r62151:
Updated my simplistic noun generation script to be aware of its new location.

r62147:
Corrected typo in Makefile list in AC_CONFIG_FILES.

r62145:
Forgot to add the new Makefile's to the list of AC_CONFIG_FILES.

r62141:
Reorganised the test dir, in anticipation of a larger set of tools and source
types in need of testing.

r62122:
Added test/data/typos.txt to hold a list of collected typos. The list is used
both for testing spellers, and as part of the preprocessor used with the Xerox
lookup tool.

r62084:
Added core filters for making transitivity and semantic tags optional in the
default generator. This change fixes most of the generation issues.

r62039:
Generalised the local/language-specific *fst processing.

r62010:
Adding support for local/language-specific *fst processing.

r61971:
Reverting the copy source file step for $GTCORE/gtshare/src/filters/, replacing
it with direct compilation. The main reason for copying was to have the source
files available for distribution, but that is not required, since the
distribution and installations of the languages depends on the presence of
GTCORE. Instead we risk that people start to add the copied source files to svn.

r61965:
The GT_FILTER_TARGETS variable needs to be parametrised for HFST and Xerox also
for the local modifications.

r61961:
Several small typos and glitches in building the main xfst's fixed.

r61959:
Removed an extraneous backslash that broke compilation.

r61954:
Hopefully this version of cp will work.

r61953:
Forgot to force the copy, which stopped the compilation.

r61949:
Added several (tag removal) filters from the old infra, and added compilation of
them as well. Applied (composed) them on the generator and the analyser, such
that the gen. and anal. should now produce the same output as in the old infra.
Corrected the README file.

r61776:
Remove duplicate methods to enable toolkits; --with-options are to be used
with optional path to tools. The automake tree will still have two
conditionals on whether CAN and WANT hfst, xfst or somesuch.

r61734:
Added border removal to the basic analyser and generator, such that they become
useful.

r61731:
Made the first test script more robust: it bails out if no transducer is found,
and gives basic feedback to whether it is testing Xerox or Hfst. The test data
files are not deleted after the test run, so that they can be easily inspected
if needed, even after a successful test run. It also uses the more common
filename and morphological tags (=less typing by default).

r61723:
Added the first test script: it tests whether noun lemmas do generate. The
script does contain some language-specific bits, and must thus be adapted to the
requirements of each language.

r61712:
Corrected reference to inituppercase.?fst.

r61708:
Corrected compilation of hyphenation rules.

r61706:
Corrected compilation of phonetic/orth2ipa rules.

r61685:
Added basic structure for hyphenation and conversion to IPA.

r61658:
Added an empty Hunspell dir to indicate the home of Hunspell building.

r61640:
Removed (auto)make processing of the now deleted src/spellchecker/ dir.
Refactored the phonotactics processing, such that xfst script files are on an
equal footing with twolc files, and such that it is easy to switch from one to the
other. Also added some basic template files for disambiguation and dependency
tagging, taken from the faroese source.

r61605:
Reorganized spell checker build structure, moving it to a new tools/ dir, which
will also be home to other applications of the basic linguistic analysers.

r61603:
Added build support for xml source files.

r61520:
Added initial support for xml source files.
NB! The support isn't fully according to GNU (autotools) standards yet, but will
have to do for the moment.

r61472:
Reverted the ignores - merging wasn't working properly for properties, and
caused a lot of noise. Hopefully this commit will cancel that noise.

r61464:
Added svn:ignore on most dirs, in the hope that they will be copied over to the
language dirs.

r61458:
A lot of cleanup and corrections:
* suffix rules in more places (although not in all - that is not possible)
* removed automake warning about pattern rules - we need them
* checked all *-include.am files for consistency, added missing Xerox and HFST
  targets where needed, corrected vars to HFST tools, added comments and
  generally made the files easier to maintain (I hope).

r61422:
Only pattern rules for uppercasing targets.

r61380:
Replaced pattern rules with suffix rules in phonology (twolc) processing. First
step in switching to suffix rules everywhere, for backwards compatibility, and
fewer automake errors. Also added support for compiling phonologies written as
xfst script files.

r61346:
Added Autoconf processing of the Makefile.am files in test/.

r61318:
More elaborate dir structure in test/, added Makefile.am files everywhere in
test/, added test/ to SUBDIRS.

r59870:
Include filters and indent properly, use gtcore version of editdist.py in
spellers.

r59759:
Corrected syntax error, renamed a couple of errors.

r59755:
Finally added initial uppercase to the actual analysing transducer. NB! It
presently works only for Xerox transducers. The code for HFST looks correct to
me, but it still doesn't work. Will have to debug this further later.

r59728:
Corrected file suffix match for xfst script files.

r59707:
Revert tests

r59705:
Test commit

r59643:
Added uppercasing compilation including proper processing for HFST. Uppercasing
is not yet applied to the output transducers. That is coming in a second
commit. Added a variable for hfst-xfst (and tested for its existence).

r59115:
Corrected dependency for speller transducer. Corrected filter path.

r59113:
Correct the path of merge instructions.

r59079:
Added filter compilation and processing of fst's.

r58484:
Corrected path to speller metadata files, corrected commands for them.

r58482:
Updated zhfst building according to new location of files.

r58478:
Replaced direct command calls with variables, completed and corrected a couple
of commands, made the transducer commands consistent across applications.

r58471:
The timestamp file has been renamed, with corresponding changes in the
Makefile.am.

r58456:
Making the echo silent for clean output.

r58409:
Renamed a couple of folders and files, moved hfst & voikko speller metadata
files to a more appropriate place.

Updated the top Makefile.am to properly check for changes to this file before
building anything else.