📄 [Paper] LingView: A Web Interface for Viewing FLEx and ELAN Files

I’m not so sure. I don’t see ELAN and FLEx as barriers, necessarily. Documentation stored in those formats is reasonably future-proof, because they’re pretty parseable. After all, XML is just plain text, and that’s what both ELAN and FLEx output.

I will say that the .eaf format itself is the kookiest format I have ever laid eyes on! I have written .eaf parsers and it took me a looong time to figure out what the heck was going on. That may very well have to do with ELAN’s original design criteria, which from what little I understand included goals that were quite different from documentary linguistics.

Even so, it’s not that hard. (Here’s one that’s less than 100 lines and can handle simple .eaf files). The tricky thing with ELAN, like with Toolbox, is that it’s perhaps too flexible. You can create all kinds of tier relationships (though many tier relationships that seem pretty obviously helpful are… hard if not impossible in .eaf). And of course, everyone does it in a different way.
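To make the "not that hard" point concrete, here is a minimal sketch of reading a simple .eaf file with Python's standard library. It is not the parser linked above. It assumes the usual EAF layout (`TIME_SLOT`, `TIER`, `ALIGNABLE_ANNOTATION`, `REF_ANNOTATION`, `ANNOTATION_VALUE`) and only follows one level of dependent annotations. The sample document, the tier names, and the `parse_eaf` function are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written .eaf fragment (hypothetical data, heavily trimmed).
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="transcription" LINGUISTIC_TYPE_REF="default-lt">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>hello world</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="translation" LINGUISTIC_TYPE_REF="translation-lt"
      PARENT_REF="transcription">
    <ANNOTATION>
      <REF_ANNOTATION ANNOTATION_ID="a2" ANNOTATION_REF="a1">
        <ANNOTATION_VALUE>hola mundo</ANNOTATION_VALUE>
      </REF_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def parse_eaf(xml_text):
    """Return time-aligned annotations with their directly dependent ones."""
    root = ET.fromstring(xml_text)

    # Time slots map an ID to milliseconds; unaligned slots have no value.
    times = {}
    for slot in root.iter("TIME_SLOT"):
        value = slot.get("TIME_VALUE")
        times[slot.get("TIME_SLOT_ID")] = int(value) if value is not None else None

    # First pass: annotations that point directly at time slots.
    annotations = {}
    for tier in root.iter("TIER"):
        for node in tier.iter("ALIGNABLE_ANNOTATION"):
            annotations[node.get("ANNOTATION_ID")] = {
                "tier": tier.get("TIER_ID"),
                "start": times.get(node.get("TIME_SLOT_REF1")),
                "end": times.get(node.get("TIME_SLOT_REF2")),
                "text": node.findtext("ANNOTATION_VALUE", default=""),
                "children": [],
            }

    # Second pass: annotations that point at another annotation instead.
    # Only parents found in the first pass are handled (one level deep).
    for tier in root.iter("TIER"):
        for node in tier.iter("REF_ANNOTATION"):
            parent = annotations.get(node.get("ANNOTATION_REF"))
            if parent is not None:
                parent["children"].append({
                    "tier": tier.get("TIER_ID"),
                    "text": node.findtext("ANNOTATION_VALUE", default=""),
                })

    return list(annotations.values())


records = parse_eaf(SAMPLE_EAF)
```

Most of the confusion comes from that indirection: times live in one place, text in another, and dependent tiers refer to annotation IDs rather than times. Real files with deeper tier hierarchies or time-subdivided tiers would need more than this.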

But to me, the biggest problem with .eaf is simple: it doesn’t seem to be designed for interlinear anything (new forays into glossing notwithstanding, may they blossom). And that is probably the core need of documentary linguists.

As for FLEx, well, Macs. That’s pretty much a deal breaker for a lot of people. (I confess, sheepishly, that I myself am one of those people.)

Anyway, any alternative that overcomes those two problems while people are documenting (not via stitch-and-bandage after-the-fact ELAN/FLEx surgery) will have an audience. And that’s all that matters, AFAICT.

(As for the grammatical categories topic, I’ll respond to your interesting comments on that in its own thread since it’s not quite relevant to the current discussion about LingView and so forth, and I hope we can talk more about it.)

Quick update from communications with the LingView team:

  • They have now actually implemented linking to timestamps thanks to our input (GitHub - BrownCLPS/LingView at prodready) :grinning:
  • They agree that linking to repositories through APIs would be a valuable feature and welcome contributors to develop the feature. There are other features they have planned to develop on their own at the moment. Would anyone be interested in actually developing this feature for LingView using e.g. the OSF API?

Yes, ELAN/.EAF offers a significant feature (time-alignment) which is valued at the beginning and end of the documentation pipeline, but not in the middle. I haven’t even successfully completed a full ELAN-FLEx-ELAN cycle with any of my data. But just thinking about it, you realize that it is an iterative process of ELAN-FLEx-ELAN-FLEx-ELAN-FLEx ad infinitum. Conceivably, you could have some data in the post-FLEx .EAF files that FLEx can’t process, and you would then need to manually merge the new FLEx output with the pre-FLEx data every time.

Right, in this regard FLEx aims for accessibility over accuracy/explicitness, in a way that is distinctly not ELAN-like. I think people complain about both applications for these opposite reasons (i.e. FLEx feeds you too many pre-defined categories which simultaneously aid and limit cross-linguistic comparison, while ELAN gives you complete flexibility with no standards to follow). This is also reflective of e.g. Martin Haspelmath’s descriptive categories vs. comparative concepts distinction. Martin argues that linguistic description should be as specific to the individual language as possible, so he believes that having too much flexibility is a good (or even necessary) problem to have. As I understand it, FLEx in principle allows you to use your own categories, but it also very strongly encourages you to use pre-defined and often ambiguous categories - and perhaps, as you suggest, it isn’t possible to define custom-made categories within the data itself.

One upshot of all of this is that if we value accuracy over accessibility then the interlinearization portion of an all-purpose fieldwork app will not be as user-friendly as FLEx. The ELAN interlinearization is a good start in that direction, but as we all know it is still lacking some critical features. Another upshot is that, in my opinion, use of FLEx for interlinearization, either in conjunction with another app/data format or not, will always be a barrier for accurate language documentation. This is the point at which the underlying goals for the development of FLEx do not align with the underlying goals of academia, and I don’t imagine we can expect those underlying goals to change.

So, yes, a new data format would be great - but also an app with the power of FLEx and the flexibility of ELAN.
Edit: oh yeah, and online live collaboration. :sweat_smile:

Quick update from Kalinda (one of the LingView developers):

“We just added URLs to specific annotations for all documents, so it’s no longer limited to files with audio or video!”
