This page documents the metadata categories and subcategories, and the labels we use for them, in the [Freiburg-Tromsø Speech Corpora|freiburg.html].

Project-internally we collect different kinds of metadata. For ethical and legal reasons, not all of them can be made public. Here we document only the metadata categories relevant for the corpora published through Korp.

The main metadata categories describe:
*Actors (e.g. a recorded speaker, author, translator or annotator)
*Sessions (e.g. an annotated recording or an annotated written text)
*Texts (e.g. modality or genre)

All publicly available metadata is stored in IMDI format, in files separate from the [ELAN|ELAN.html] annotations, on the Session node in the [TLA|TLA.html]. A script (which does not yet exist) will convert the IMDI into a structure that can be read into the Korp interface.

!!!Actors
*Speakers (e.g. informants/consultants recorded and transcribed, or authors/translators of written text included in the corpora)
*Annotators (e.g. PIs or assistants transcribing, translating or otherwise annotating recordings or written text included in the corpora)

!!!Sessions
*Actors
*Date
*Equipment
*Media
*Place
*Project
*Languages

!!!Texts
!!Actors
!!Date
!!Language(s)
!!Modality
As a label for this category we use _Modality_, by which we mean the way in which signs are transmitted by a sender. This category has two values:
*oral (e.g. speech which we have recorded as audio or audio+video and transcribed, or speech which is transcribed but for which no audio is available, because it was lost or because the speech was transcribed without being recorded)
*written (e.g. handwritten or printed texts, texts published online)

Other potential values (not relevant for our projects) are:
*gestured
*signed

Note that the kind of perception by a receiver is not relevant for our metadata categories (a written text can be received orally if we use text-to-speech, etc.).
Neither does _Modality_ in our sense refer to the actual medium (paper, video, etc.).

!!Language
Three-letter code in accordance with ISO 639-3

!!Genre
*poetry
*fiction
*ritual
*advertisement
*biography
*fairy tale
*facta
*idiom
*narrative
*teaching
*story

!!Register
*formal
*informal
*neutral

!!Medium

!!!Other conventions
Note that the file names we use already include some metadata. For instance:
*sms19610000lagercrantz318
*sjd20150609aaa-sport

Here the first three letters (__sms__ or __sjd__) always mark the language (or main language) of a given session, in accordance with ISO 639-3. The following eight digits (__19610000__ or __20150609__) always mark the date of a given session in the format YYYYMMDD. If the exact date is unknown or cannot be specified (e.g. in a book publication where only the year is given), we use the digit 0 for the unknown positions.

!!!See also
XXX - ???
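
As an illustration of the file name convention described under "Other conventions", the following minimal Python sketch splits a session file name into its language code, its date parts and the remainder. It is not part of our tooling: the names {{SessionName}} and {{parse_session_name}} are hypothetical, and we assume here that the language code is written in lowercase and that everything after the eight digits is free-form text. Date parts written as zeros are returned as {{None}}, i.e. as unknown.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Three lowercase letters (ISO 639-3 code), eight digits (YYYYMMDD),
# then a free-form remainder (assumption: no further structure).
_NAME_RE = re.compile(r"([a-z]{3})([0-9]{4})([0-9]{2})([0-9]{2})(.*)")


@dataclass
class SessionName:
    """Hypothetical container for the metadata encoded in a file name."""
    language: str
    year: Optional[int]
    month: Optional[int]
    day: Optional[int]
    rest: str


def _known(digits: str) -> Optional[int]:
    """Return the number, or None if the field consists only of zeros."""
    value = int(digits)
    return value if value != 0 else None


def parse_session_name(name: str) -> SessionName:
    """Split a session file name according to the convention above."""
    match = _NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid session file name: {name!r}")
    language, year, month, day, rest = match.groups()
    return SessionName(language, _known(year), _known(month), _known(day), rest)


if __name__ == "__main__":
    for example in ("sms19610000lagercrantz318", "sjd20150609aaa-sport"):
        print(parse_session_name(example))
```

For the first example name, only the year is known, so the month and day come back as {{None}}; for the second, all three date parts are filled in.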