!!!Corpus maintenance

This document tracks measures to improve the corpus collection and conversion process. See also the [sentence alignment page|/tools/tca2.html], which covers that specific part of corpus maintenance.

!!Corpus improvement work

!Folder structure etc.

* news: is it possible to move files from bound to free?
* science: we have sme/science files in both free and bound, with no clear division between them
* sma: a separate option for classical texts

!!!Tasks

!!Where do we find texts

* [A sketch list of where to find Saami text online|SaamiTextOnline.html]

!!Parallel texts

* [Suggestions for detecting (flaws in) parallel texts|corpus_parallel_maintenance.html]
* [How to manipulate the conversion of different file formats, in xsl, to get the correct language|CorpusConvertingManipulation.html]

!!!Meetings in the corpus improvement project

* 2019: [ 7.6.|../admin/corpus/Meeting_2019-06-07.html]
* 2017: [ 3.3.|../admin/corpus/Meeting_2017-03-03.html] // [ 25.4.|../admin/corpus/Meeting_2017-04-25.html] // [ 6.9.|../admin/corpus/Meeting_2017-09-06.html] // [ 5.10.|../admin/corpus/Meeting_2017-10-05.html]
* 2016: [ 26.10.|../admin/corpus/Meeting_2016-10-26.html] // [ 02.11.|../admin/corpus/Meeting_2016-11-02.html] // [ 16.11.|../admin/corpus/Meeting_2016-11-16.html] // [ 25.11.|../admin/corpus/Meeting_2016-11-25.html]
* 2014: [ 12.3.|../admin/corpus/Meeting_2014-03-12.html]
* 2012: [ 12.1.|../admin/corpus/Meeting_2012-01-12.html] // [ 19.1.|../admin/corpus/Meeting_2012-01-19.html] // [ 25.1.|../admin/corpus/Meeting_2012-01-25.html] // [ 1.2.|../admin/corpus/Meeting_2012-02-01.html] // [ 7.2.|../admin/corpus/Meeting_2012-02-07.html] // [ 13.2.|../admin/corpus/Meeting_2012-02-13.html] // [ 17.2.|../admin/corpus/Meeting_2012-02-17.html] // [ 29.2.|../admin/corpus/Meeting_2012-02-29.html] // [ 12.3.|../admin/corpus/Meeting_2012-03-12.html] // [ 22.3.|../admin/corpus/Meeting_2012-03-22.html] // [ 31.8.|../admin/corpus/Meeting_2012-08-31.html]
* 2011: [ 7.4.|../admin/corpus/Meeting_2011-04-07.html] // [ 11.4.|../admin/corpus/Meeting_2011-04-11.html] // [ 3.5.|../admin/corpus/Meeting_2011-05-03.html] // [ 27.6.|../admin/corpus/Meeting_2011-06-27.html] // [ 12.9.|../admin/corpus/Meeting_2011-09-12.html] // [ 21.9.|../admin/corpus/Meeting_2011-09-21.html] // [12.10.|../admin/corpus/Meeting_2011-10-12.html] // [ 7.11.|../admin/corpus/Meeting_2011-11-07.html] // [11.11.|../admin/corpus/Meeting_2011-11-11.html] // [25.11.|../admin/corpus/Meeting_2011-11-25.html] // [28.11.|../admin/corpus/Meeting_2011-11-28.html] // [ 8.12.|../admin/corpus/Meeting_2011-12-08.html] // [14.12.|../admin/corpus/Meeting_2011-12-14.html] // [20.12.|../admin/corpus/Meeting_2011-12-20.html]

!!!OCR and conversion errors left over from spring 2011

* [OCR error overview, May 2011|corpus_ocr_may11.html] (there are still open issues here)
* [Conversion errors|corpus_conversionerrors_may11.html] (open issues here?)