!!!Meeting with Polderland 12.9.2006

Participants:
* Peter Beinema
* Thomas Omma
* Sjur Moshagen


!!Agenda

* questions and answers


!!New team members Polderland
* Marijke Koster
* Jeroen Daanen


!!Since last time

Thomas has sent more data to PL, and Peter has written a script to process and
"refactor" the "stems" into something more digestible for the PL technology.

!!Polderland tasks

PL will try to reach a conclusion on whether the present approach is acceptable,
or whether a finite-state (FS) machine is the better solution overall.

Now splitting the word forms into subgroups that behave in more or less the
same way. The number of subgroups can be fairly large.

Split words into stem + derivation cluster, so that the two parts are listed
separately instead of as every combination: 3000*10000 -> 3000+10000
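
The size reduction noted above can be illustrated with a minimal sketch. The stems and derivation clusters below are hypothetical placeholders, not actual data from the project, and the code only shows the arithmetic of listing the two parts separately versus listing every combination; it is not PL's implementation.

```python
from itertools import product

# Hypothetical toy data; the real stems and derivation clusters are not
# given in these notes.
stems = ["stemA", "stemB", "stemC"]
clusters = ["", "+x", "+y", "+x+z"]


def full_listing_size(n_stems, n_clusters):
    """Entries needed when every stem/cluster combination is listed."""
    return n_stems * n_clusters


def factored_size(n_stems, n_clusters):
    """Entries needed when stems and clusters are listed separately."""
    return n_stems + n_clusters


def generate(stems, clusters):
    """Recombine the two lists into full forms on demand."""
    for stem, cluster in product(stems, clusters):
        yield stem + cluster


print(full_listing_size(3000, 10000))  # 30000000
print(factored_size(3000, 10000))      # 13000
print(len(list(generate(stems, clusters))))  # 12
```

Note that recombining freely, as `generate` does, yields every combination, which is exactly where overgeneration comes from.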

Current derivation system overgenerates. Unclear whether this is a problem in
actual use: PLD compounding also overgenerates, and this is not perceived to be
a problem.

Additional problem: not all derivations are possible for all stem forms.
Idiosyncratic? Can this be tackled by grouping words in a smart way, giving each
group its own set of derivations?

Possibility: limit derivations and add additional forms as "full lemmas"
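
The grouping idea and the full-lemma fallback noted above might be sketched as follows. All group names, stems and derivation labels are hypothetical, chosen only for illustration; how PL would actually represent the groups is not stated in these notes.

```python
# Hypothetical group names, stems and derivation labels, for illustration only.
DERIVATIONS_BY_GROUP = {
    "group1": {"+d1", "+d2"},
    "group2": {"+d1"},
}
GROUP_OF_STEM = {"stemA": "group1", "stemB": "group2"}

# Forms outside the limited derivation sets, listed as full lemmas.
EXTRA_FULL_LEMMAS = {"stemB+d3"}


def is_accepted(stem, derivation):
    """Accept a form if its group allows the derivation,
    or if the whole form is listed as a full lemma."""
    if stem + derivation in EXTRA_FULL_LEMMAS:
        return True
    group = GROUP_OF_STEM.get(stem)
    return group is not None and derivation in DERIVATIONS_BY_GROUP[group]


print(is_accepted("stemA", "+d2"))  # True: group1 allows +d2
print(is_accepted("stemB", "+d2"))  # False: group2 does not allow +d2
print(is_accepted("stemB", "+d3"))  # True: listed as a full lemma
```

With this layout, overgeneration is limited by the per-group sets, while idiosyncratic forms stay reachable through the full-lemma list.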

!!Hyphenation

PL is looking for a recent CodeWarrior to compile Universal binaries for
InDesign.

The next version of InDesign will most likely build with Xcode. Thus,
CodeWarrior is needed only for the present InDesign CS2.

!!!TODO
* send processed data sets back to Thomas (__Peter__)
* review the processed data sets (__Thomas__)