On the way to an improved metadata display:
(1) weird file names that can be simplified without problem
- '__', '...', '_-_'
aaarjel_faaroe__raettad_oppgave....doc.xml
a_saaran_gaerja_-_biejjeste_beajjan.doc.xml
==> aha, there are special characters in some filenames: bug
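A possible way to do the simplification in (1), as a minimal Python sketch. Assumptions: each of the three separator patterns listed above can be collapsed to a single character without losing information, and the function name simplify_name is made up for illustration. It does not deal with the special characters noted above.

```python
import re


def simplify_name(name: str) -> str:
    """Collapse the separator runs from (1) into a single character."""
    # '_-_' becomes a single underscore
    name = name.replace("_-_", "_")
    # two or more underscores in a row become one
    name = re.sub(r"_{2,}", "_", name)
    # two or more dots in a row become one
    name = re.sub(r"\.{2,}", ".", name)
    return name


for original in (
    "aaarjel_faaroe__raettad_oppgave....doc.xml",
    "a_saaran_gaerja_-_biejjeste_beajjan.doc.xml",
):
    print(original, "->", simplify_name(original))
```

Renaming the files themselves would also require updating anything that refers to the old names, which this sketch leaves out.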
(2) bug: files right in the main category
bc_sma_ficti.txt:a_saaran_gaerja_-_biejjeste_beajjan.doc.xml
bc_sma_ficti.txt:aaarjel_faaroe.doc.xml
bc_sme_ficti.txt:aigin_lavra_-_olles.doc.xml
bc_sme_ficti.txt:aikio-sme-001-005corr.txt.xml
bc_sme_laws.txt:lov_om_psykisk.nob.doc.xml
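The cases in (2) could be found automatically, roughly like this. This is only a sketch under two assumptions: the lines have the "listing:entry" form shown above, and any entry ending in ".xml" is a file rather than a subdirectory. The function name files_in_main_category is hypothetical.

```python
def files_in_main_category(lines):
    """Return (listing, entry) pairs whose entry looks like a file."""
    flagged = []
    for line in lines:
        # split only at the first colon: "bc_sma_ficti.txt:some_entry"
        listing, sep, entry = line.strip().partition(":")
        # assumption: entries ending in .xml are files, everything else is a directory
        if sep and entry.endswith(".xml"):
            flagged.append((listing, entry))
    return flagged


sample = [
    "bc_sma_ficti.txt:aaarjel_faaroe.doc.xml",
    "bc_sme_laws.txt:lov_om_psykisk.nob.doc.xml",
    "bc_sma_admin.txt:other",
]
for listing, entry in files_in_main_category(sample):
    print(listing, entry)
```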
(3) bug: inconsistent naming (cip has corrected this type of inconsistency in the freecorpus)
bc_sma_admin.txt:other
bc_sma_news.txt:other
vs.
bc_sma_bible.txt:other_files
bc_sma_facta.txt:other_files
bc_sme_bible.txt:other_files
bc_sme_facta.txt:other_files
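To see at a glance which listing uses which variant in (3), something like the following sketch might do. It assumes the same "listing:entry" line format as above and that "other" and "other_files" are the only two variants; the name catch_all_variants is invented here.

```python
from collections import defaultdict


def catch_all_variants(lines, variants=("other", "other_files")):
    """Map each variant name to the listings that use it."""
    found = defaultdict(list)
    for line in lines:
        listing, sep, entry = line.strip().partition(":")
        # exact match only, so "other" does not also count "other_files"
        if sep and entry in variants:
            found[entry].append(listing)
    return dict(found)


sample = [
    "bc_sma_admin.txt:other",
    "bc_sma_news.txt:other",
    "bc_sma_bible.txt:other_files",
]
print(catch_all_variants(sample))
```

If more than one key comes back, the naming is inconsistent and one variant has to be picked.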
(4) bug: wrong grouping: avvir and MinAigi each obviously have two different dirs on the same level, yet they
belong to one and the same group "data stemming from ..."
bc_sme_news.txt:avvir.no
bc_sme_news.txt:Avvir_xml-filer
bc_sme_news.txt:MinAigi
bc_sme_news.txt:minaigi.no
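One way to check the grouping in (4) is to reduce each dir name to a source key and collect the dirs per key. A sketch only: the suffix list is an assumption read off the four names above, and source_key and group_by_source are hypothetical names.

```python
from collections import defaultdict

# assumption: these two suffixes are taken only from the four dir names in (4)
SUFFIXES = (".no", "_xml-filer")


def source_key(dirname):
    """Lowercase the dir name and strip a known suffix."""
    key = dirname.lower()
    for suffix in SUFFIXES:
        if key.endswith(suffix):
            key = key[: -len(suffix)]
    return key


def group_by_source(dirnames):
    """Collect dir names that reduce to the same source key."""
    groups = defaultdict(list)
    for dirname in dirnames:
        groups[source_key(dirname)].append(dirname)
    return dict(groups)


print(group_by_source(["avvir.no", "Avvir_xml-filer", "MinAigi", "minaigi.no"]))
```

Any key with more than one dir is a candidate for merging into a single group.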
(5) inconsistent info in metadata (I hope we will not need all of it, only firstname and lastname?):