The directory {{pedversions/sjdoahpa/sjd/src}} contains the entries used in the current online version of sjdoahpa. The russjd and engsjd files will go in {{pedversions/sjdoahpa/sjd/russjd}} and {{pedversions/sjdoahpa/sjd/engsjd}}.

!!!Test

@cip reverted sjdrus-data to russjd and installed the new Leksa (at the moment the restriction elements are ignored) ==> test the new Leksa online in both directions!

!!!TODO correct the restrictions in the translations

{{{
моаӆӆьчхэ облезать о шкуре облезать облезть to grow bare about a skin to grow bare
}}}

this way

{{{
моаӆӆьчхэ облезать облезть to grow bare
}}}

!!!TODO correct inconsistencies in the verb file: eng verb either with 'to' or without

{{{
v_sjdrus.xml: to sympathize
v_sjdrus.xml: will hesitate for a time
v_sjdrus.xml: to come nearer
v_sjdrus.xml: to become hardened
v_sjdrus.xml: to become callous
v_sjdrus.xml: to harden
v_sjdrus.xml: to become callous
v_sjdrus.xml: to make someone sincere
v_sjdrus.xml: to become sincere
v_sjdrus.xml: to move
v_sjdrus.xml: change a place
v_sjdrus.xml: to pass to a new place
}}}

!!!TODO Contrary to Trond's claim, not all sjd lemmata have an eng translation:

{{{
кэ̄ннц ноготь
то̄лл огонь костер место для костра

Total of
xml>grep -h '_ENG' *.xml | wc -l
49

Here is the list
xml>grep -n '_ENG' *.xml
n_sjdrus.xml:1579: ноготь_ENG
n_sjdrus.xml:1597: огонь_ENG
n_sjdrus.xml:2723: лосиха_ENG
n_sjdrus.xml:2933: важенка_ENG
n_sjdrus.xml:4608: pulp ляшки_ENG
n_sjdrus.xml:5525: оленята_ENG
n_sjdrus.xml:5586: олененок_ENG
n_sjdrus.xml:5724: морозец_ENG
n_sjdrus.xml:5739: морозец_ENG
n_sjdrus.xml:5754: морозец_ENG
n_sjdrus.xml:5812: сиг_ENG
n_sjdrus.xml:5826: кумжа_ENG
n_sjdrus.xml:6020: пинагор_ENG
n_sjdrus.xml:6049: каменки_ENG
n_sjdrus.xml:6050: мальки_ENG
n_sjdrus.xml:6065: сиг big_ENG
n_sjdrus.xml:6066: big сиг_ENG
n_sjdrus.xml:6122: хариус_ENG
n_sjdrus.xml:6136: хариус_ENG
n_sjdrus.xml:6377: smell варенной fishes_ENG
n_sjdrus.xml:6447: бражка_ENG
n_sjdrus.xml:6886: лопанье_ENG
n_sjdrus.xml:6914: лопанье_ENG
n_sjdrus.xml:7243: шамшура_ENG
n_sjdrus.xml:7300: zone part on female ярах_ENG
n_sjdrus.xml:7332: skin дублённая_ENG
n_sjdrus.xml:7361: позументная tape_ENG
n_sjdrus.xml:7390: valve on man's ярах_ENG
n_sjdrus.xml:9687: вежа_ENG
n_sjdrus.xml:10103: pure place in куваксе_ENG
n_sjdrus.xml:11309: круча mountains_ENG
n_sjdrus.xml:11617: озерко_ENG
n_sjdrus.xml:11737: корга_ENG
n_sjdrus.xml:12211: thickets ивника_ENG
n_sjdrus.xml:15740: сонорный a sound_ENG
n_sjdrus.xml:15754: deaf сонорный a sound_ENG
n_sjdrus.xml:15782: deaf сонорный a short nasal sound_ENG
n_sjdrus.xml:15796: deaf сонорный a long nasal sound_ENG
n_sjdrus.xml:15810: deaf сонорный языковый a short sound_ENG
n_sjdrus.xml:15824: deaf сонорный языковый a long sound_ENG
n_sjdrus.xml:16663: cuffs малицы_ENG
n_sjdrus.xml:16729: малица_ENG
pron_sjdrus.xml:124: which-nibud_ENG
v_sjdrus.xml:1525: тошнить_ENG
v_sjdrus.xml:1709: тошнить_ENG
v_sjdrus.xml:4584: small шинковать_ENG
v_sjdrus.xml:4805: tax to give_ENG
v_sjdrus.xml:4822: is subject to the supreme court_ENG
v_sjdrus.xml:4984: will hesitate for a time_ENG
}}}

!!!possible future todo

@Micha: a few observations:

* ё vs. е in Russian (e.g. вдвоём / вдвоем); perhaps we should consistently use ё in the xml, but include е (with spellrelax) for oahpa users?
* the semantics should be checked (do the other oahpas use predefined sets of values?), e.g. why is э̄ххт тоа̄фант one thousand "HUMAN", or why is кутӭ-кутӭ two each "HUMAN" and "FOOD"? It could be anything: "cars", "reindeer", "xml databases", etc.
* common (uni)coding issues (perhaps we can apply a script to future incoming data):
** Latin letters in Cyrillic: a --> а, o --> о, etc. (even in Russian text)
** Precomposed vs. combining diaeresis: ё --> ё, ӓ --> ӓ, ӭ --> ӭ
** Precomposed vs. combining macron: ӣ --> ӣ, ӯ --> ӯ
* several multi word lemmata, like э̄ххт чӯдтҍ or югкеналла лыдцант or пя̄лла ӣнсэй оанҍхэсь нюннҍ тӣххт (especially the latter two are definitely not lemmas, but paraphrases)
** there are even entries with multiword expressions both as lemmas and translations, like the following. Can we use these for a vocabulary trainer?
{{{
ко̄ппче соа̄йметҍ собирать сетки to collect grids
}}}
* English verbs with(out) "to"? (e.g. undress vs. to dress)
* free word order in Russian NP? (e.g. хвост короткий and короткий хвост)
* attr. vs. pred. adjectives (in sjd and rus!)
* translations need to be checked carefully, cf. the example below: the basic meaning of this Kildin word is clearly "unfamiliar, unknown". Of course in some situations this can also be expressed as "new", but as a translation of the lemma >eehk< "new" is clearly wrong, especially in a vocabulary trainer (in a true dict we could give this "new-meaning" in an example sentence)
{{{
е̄ххк незнакомый новый unfamiliar new
}}}
* inflected form as lemma: in the entry below, pos="adv" is wrong, because this is an inflected noun (which can of course be used as an adverbial); I understand that such forms should be used for training vocabulary, but we have to find another tag for the pos value here
{{{
углясьт в уголке in a corner
}}}
* this is not a "dim_set", but a "pl_set"! We could of course use the oahpa for the training of inflectional forms, but is it useful to have plural forms mixed up with diminutives?
{{{
па̄лл мяч ball
па̄л мячи balls
}}}
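Regarding the (uni)coding issues listed above (Latin letters in Cyrillic, precomposed vs. combining diacritics), a script for future incoming data might look roughly like the sketch below. It is only an illustration under several assumptions: the names {{LATIN_TO_CYRILLIC}}, {{fix_homoglyphs}} and {{normalize_entry}} are hypothetical, the look-alike table is a small sample and not a complete list, and NFC (precomposed where Unicode has a precomposed letter) is assumed as the target form, which these notes do not actually decide.

```python
import unicodedata

# Illustrative subset of Latin letters that look like Cyrillic ones.
# The table is an assumption for this sketch, not an agreed project list.
LATIN_TO_CYRILLIC = {
    "a": "\u0430", "c": "\u0441", "e": "\u0435", "o": "\u043e",
    "p": "\u0440", "x": "\u0445", "y": "\u0443",
    "A": "\u0410", "C": "\u0421", "E": "\u0415", "O": "\u041e",
    "P": "\u0420", "X": "\u0425",
}


def is_cyrillic(ch):
    """True if the Unicode character name marks ch as a Cyrillic letter."""
    return unicodedata.name(ch, "").startswith("CYRILLIC")


def fix_homoglyphs(word):
    """Replace Latin look-alikes, but only in words that also contain Cyrillic.

    Pure Latin words (e.g. the English translations) are returned unchanged.
    """
    if not any(is_cyrillic(ch) for ch in word):
        return word
    return "".join(LATIN_TO_CYRILLIC.get(ch, ch) for ch in word)


def normalize_entry(text):
    """Compose diacritics with NFC, then repair mixed-script words.

    NFC yields precomposed letters where Unicode defines them; letters
    without a precomposed form keep their combining mark.
    """
    text = unicodedata.normalize("NFC", text)
    return " ".join(fix_homoglyphs(word) for word in text.split(" "))
```

For example, {{normalize_entry("вдвое\u0308м")}} returns the word with a precomposed ё, while a Kildin letter such as э with macron keeps its combining mark because Unicode has no precomposed form for it. Note the limits of the heuristic: a Latin letter standing alone as a one-letter word is not caught, and a genuinely mixed token (Latin plus Cyrillic on purpose) would be altered, so the output should be reviewed before it is committed.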