Hi all!
I’ve recently started working with some of Jochelson’s Yukaghir legacy materials from the late 19th century. There’s a collection of 100+ texts (in Cyrillic and some sort of Roman transliteration), a grammar sketch, and a vocabulary list. I’m trying to build a corpus from these materials together with contemporary texts and, hopefully, data from my own fieldwork.
I have a question for you @pathall, since you’ve mentioned JSON. I’m quite new to this; so far I’ve mostly been encoding each text as XML (following the BNC structure). Would you recommend a different encoding?
Albert