Found via the California Language Archive's Facebook page:
CoCoON for "Digital Oral Corpus COllections" is a technical platform that supports oral resource producers in creating, structuring and archiving their corpuses. A corpus can be composed of recordings (generally audio), possibly accompanied by annotations of these recordings.
The resources are first catalogued and stored, and then archived in the TGIR Huma-Num archive. The author and his institution remain responsible for the deposited materials and can benefit from restricted and secure access to their data, for a defined period of time, if the content of the information is considered sensitive.
That post links to this text resource on Nivaclé, a Matacoan language spoken in Paraguay and Argentina:
It looks quite nice, and it seems quite straightforward to get to annotations of texts from the overview page. It works fine in Chrome on Mac; Firefox, however, didn't seem to want to play the .mp4 and downloaded it instead. I was also rather surprised to realize that there are only annotations of the Spanish translation, and not of the Nivaclé itself. There doesn't seem to be any metadata on the overview page that indicates that (of course, even text of the translation is better than no text at all).
There is an interactive playback view (click on "Show") that displays the available annotations; it worked in Chrome:
Update!
Hey, I poked around some more and found a text that has more annotations:
https://cocoon.huma-num.fr/exist/crdo/meta/cocoon-1087d569-5011-46b1-87d5-695011b6b125?lang=en
This is a text in "Ta'izzi-Adeni Arabic", and as you can see there are multiple tiers of annotation.
The resources download nice and tidy, like this:
And here's what the LACITO XML format looks like for a single sentence:
<S id="ACQ_MCSS_NARR_04_chat-borgne_01">
  <FORM>ginni hāda</FORM>
  <AUDIO start="1.87" end="2.84"/>
  <W>
    <FORM>ginni</FORM>
    <M class="">
      <TRANSL xml:lanf="fr">Djinn</TRANSL>
      <FORM>ginni</FORM>
    </M>
  </W>
  <W>
    <FORM>hāda</FORM>
    <M class="">
      <TRANSL xml:lanf="fr">DEM.M.SG</TRANSL>
      <FORM>hāza</FORM>
    </M>
  </W>
  <TRANSL xml:lanf="fr">S'agissant de ces ginns.</TRANSL>
</S>
All very straightforward. Hooray!
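Because the format is this regular, pulling the tiers out of a sentence takes only a few lines. Here is a minimal sketch using Python's standard xml.etree.ElementTree, assuming every sentence looks like the one above (one AUDIO element, W elements containing M elements). The parse_sentence helper and the dict layout are my own invention for illustration and not part of any CoCoON or LACITO tooling. A real downloaded file presumably wraps many S elements in a root element, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# The sentence from the post, copied as-is (including the xml:lanf attribute).
SAMPLE = """<S id="ACQ_MCSS_NARR_04_chat-borgne_01">
  <FORM>ginni hāda</FORM>
  <AUDIO start="1.87" end="2.84"/>
  <W>
    <FORM>ginni</FORM>
    <M class="">
      <TRANSL xml:lanf="fr">Djinn</TRANSL>
      <FORM>ginni</FORM>
    </M>
  </W>
  <W>
    <FORM>hāda</FORM>
    <M class="">
      <TRANSL xml:lanf="fr">DEM.M.SG</TRANSL>
      <FORM>hāza</FORM>
    </M>
  </W>
  <TRANSL xml:lanf="fr">S'agissant de ces ginns.</TRANSL>
</S>"""


def parse_sentence(xml_text):
    """Flatten one <S> element into a plain dict (hypothetical helper)."""
    s = ET.fromstring(xml_text)
    audio = s.find("AUDIO")
    words = []
    for w in s.findall("W"):
        # Each <W> holds its surface form plus one or more <M> morphemes.
        morphemes = [
            {"form": m.findtext("FORM"), "gloss": m.findtext("TRANSL")}
            for m in w.findall("M")
        ]
        words.append({"form": w.findtext("FORM"), "morphemes": morphemes})
    return {
        "id": s.get("id"),
        # A bare tag name only matches direct children, so these pick up
        # the sentence-level FORM and TRANSL, not the word-level ones.
        "form": s.findtext("FORM"),
        "translation": s.findtext("TRANSL"),
        "start": float(audio.get("start")),
        "end": float(audio.get("end")),
        "words": words,
    }


sentence = parse_sentence(SAMPLE)
for word in sentence["words"]:
    glosses = "-".join(m["gloss"] for m in word["morphemes"])
    print(word["form"], glosses, sep="\t")
```

The loop at the end prints each word next to its gloss, and the start and end values (in seconds, judging by the sample) could be used to cut the matching audio clip.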