This document is for discussion notes
=====================================


This command gives transcription + POS:

cat corp/pedkorpus.txt | preprocess | lookup -flags mbTT -utf8 bin/pos-sme.fst | tr '\t' '+' | cut -d"+" -f1,3 | uniq | lookup -flags mbTT -utf8 bin/phon-sme.fst | cut -f2 | l
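
A rough sketch of what the tr/cut stage in the middle of the pipeline does, written in Python for illustration only. It assumes that lookup prints one tab-separated "wordform<TAB>analysis" pair per line and that the tr argument is meant to be a tab. The sample line is built from the čájehit example further down, and the function name is made up.

```python
def wordform_and_pos(lookup_line: str) -> str:
    """Mimic `tr '\\t' '+' | cut -d"+" -f1,3` on one line of lookup output."""
    # Assumption: the line looks like "wordform<TAB>lemma+POS+tags...".
    fields = lookup_line.rstrip("\n").replace("\t", "+").split("+")
    # cut -f1,3 keeps the 1st and 3rd fields (1-based) and skips
    # fields that do not exist on the line.
    kept = [fields[i] for i in (0, 2) if i < len(fields)]
    return "+".join(kept)


print(wordform_and_pos("čájehán\tčájehit+V+Ind+Prs+Sg1"))
```

So what reaches the second lookup is the word form plus its POS tag, with the lemma and the remaining tags cut away.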


Discussion 22nd Aug
===================

text -> /syntactic analysis/ -> intonation marking -> phon repr -> sound gen


lexical lookup:

find base form, pick enriched graphemes, use them in conversion to phon repr

e with dot below = e7 in our internal repr
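
A minimal sketch of the e7 convention, in Python. It assumes the enriched grapheme is Unicode U+1EB9 (or e followed by combining dot below, U+0323) and only handles lower case. The function names are hypothetical and not part of our sources.

```python
import unicodedata

DOT_BELOW_E = "\u1eb9"  # LATIN SMALL LETTER E WITH DOT BELOW


def to_internal(text: str) -> str:
    """Rewrite e-with-dot-below as the internal symbol e7."""
    # NFC first, so that "e" + combining dot below (U+0323) is caught too.
    return unicodedata.normalize("NFC", text).replace(DOT_BELOW_E, "e7")


def strip_marks(internal: str) -> str:
    """Drop the enrichment again, as in the PU -> PL step below."""
    return internal.replace("e7", "e")


print(to_internal("čáj\u1eb9hán"))
print(strip_marks("čáje7hán"))
```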


LU čájehit+V+Ind+Prs+Sg1 sme.fst + sme-dis.rle
   ---------------------
LL čáje7hán 
	=> tʃæjəhæn
 
PU čáje7hán
   --------
PL čájehán

compiles into

LU čájehit+V+Ind+Prs+Sg1
   ---------------------
PL čájehán
		=> tʃæjehæn (given the transducer phon-sme.fst)


=> e7 disappears, we must rewrite our stuff in order to put it into


"Diphthong Simplification in i-Stems before Suffixes Beginning with j:"
  Vx:0 <=> Vow: _ Cns:+  i  ( %>: ) ( »: ) X5: ;
	where Vx in (e o a) ;          ! goah'tiX5jd:go0điid

"Diphthong lengthening in Simplification in i-Stems before Suffixes Beginning with j:"
  Vx:Vy <=>  _ Vz: Cns:+  i  ( %>: ) ( »: ) X5: ;
	where Vx in (o  i  u)
	      Vy in (o9 i9 u9)
	      Vz in (a  e  o) matched ;  ! goah'tiX5jd:go0điid
	
oa => o9
ie => i9
uo => u9
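
The mapping above can be tried out with a small Python sketch. This is only an approximation of the rule context (diphthong, one or more non-vowel symbols, i, then X5). It ignores the optional boundary symbols and consonant gradation, and the vowel set is an assumption made for the example.

```python
import re

LONG = {"oa": "o9", "ie": "i9", "uo": "u9"}
VOWELS = "aáeiou"  # assumed vowel inventory, for illustration only

# Diphthong, then 1+ non-vowel symbols, then i, then the X5 trigger.
PATTERN = re.compile(rf"(oa|ie|uo)(?=[^{VOWELS}\d]+iX5)")


def simplify(lexical: str) -> str:
    """Replace the diphthong by its long monophthong symbol before ...iX5."""
    return PATTERN.sub(lambda m: LONG[m.group(1)], lexical)


print(simplify("goah'tiX5jd"))  # context matches, oa is rewritten
print(simplify("goahtiX4"))     # no X5, left unchanged
```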


johka
jo0ga

gođii go:ðij
	
	
	lexical transducer: remove LU, and we are left with a one-level model that contains o9;
	then we put the two-level model on top of it
	
LU	goahti+N+Sg+Gen
LL	goahtiX4
	
PU	goahtiX4
	twol
PL	goađi

66+67 (the LU and LL lines above) = lexical transducer

69-71 (the PU, twol and PL lines above) = twol

1 - remove the upper side of the lexical transducer (LU:LL => LL)
2 - compile twol (PU:PL)
3 - compose the one-level lexicon with twol using .o. (LL/PU -> LL:PL)
4 - compose 3 with the generating IPA twol: IPA:PL

IPA
twol
LL

IPA:LL
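
Steps 1 to 3 can be sketched on toy data in Python, with each transducer reduced to a finite set of (upper, lower) string pairs. This is not how the fst tools work internally, and the helper names are made up. The pairs are copied from the examples above. Step 4 is left out, since the notes do not pin down which side of the IPA twol faces which level.

```python
def compose(upper_rel, lower_rel):
    """A .o. B on finite relations: join where A's lower side equals B's upper side."""
    return {(a_up, b_low)
            for a_up, a_low in upper_rel
            for b_up, b_low in lower_rel
            if a_low == b_up}


def lower_side(relation):
    """Step 1: forget the upper side, keep the lower strings as an identity relation."""
    return {(low, low) for _, low in relation}


lexicon = {                                   # LU:LL
    ("goahti+N+Sg+Gen", "goahtiX4"),
    ("čájehit+V+Ind+Prs+Sg1", "čáje7hán"),
}
twol = {                                      # PU:PL (step 2, listed by hand here)
    ("goahtiX4", "goađi"),
    ("čáje7hán", "čájehán"),
}

direct = compose(lexicon, twol)               # LU:PL, the enriched level is gone
one_level = lower_side(lexicon)               # step 1: LL only
ll_to_pl = compose(one_level, twol)           # step 3: LL:PL

for pair in sorted(ll_to_pl):
    print(pair)
```

In `direct` neither side contains e7 or X4 any more, which is the problem noted earlier. In `ll_to_pl` the enriched form survives as the upper side and is still available for a later IPA step.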