Thanks for your thoughts!
yep yep yep yep yep
I agree that these are pretty much a sine qua non. Also, we need to be handling content in these data types in a way that isn’t just “plain text”, but can handle structured representations of linguistic units, alignment, etc. Otherwise we may as well just be using existing tools like Zoom, Facebook chat, Dropbox, etc. (Not that those aren’t useful tools in many cases.)
I came up with the list below as a sort of “umbrella list” of … well, “aspects” of what we’d need to think about?
(Silly emoji symbols for fun, alternative suggestions welcome):
Media types (in increasing order of difficulty)
Text
Images
Audio
Video
Connectivity modes
Offline (actually I think this is worth thinking about, even just as a foil to the fully collaborative, networked, blue-sky scenario)
Online
Real-time collaboration
Archiving
Versioning
Archiving
Linguistic data types
Words
Morphemes
Texts (in the sense of a narrative, a conversation, whatever other genre)
“Sentences” (interlinearized “segments”… IGTs… whatever you want to call them!)
Dictionaries
Orthographies
Phonetic inventories
(Language) Metadata
Paradigms (paradigm… pair o’ socks… HAHA. Ehem.)
The last category, linguistic data types, is pretty much in line with what I’m trying to implement in the software library I’m writing about in my dissertation (it’s called docling.js).
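To make “structured, not plain text” a bit more concrete, here is a minimal sketch of how a few of the linguistic data types in the list might nest. This is purely illustrative and is not docling.js's actual API: the class names (`Morpheme`, `Word`, `Sentence`), their fields, and the optional `start`/`end` media timestamps are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Morpheme:
    form: str   # the morpheme as written, e.g. "gato"
    gloss: str  # its gloss, e.g. "cat"


@dataclass
class Word:
    form: str
    morphemes: List[Morpheme] = field(default_factory=list)

    def gloss(self) -> str:
        # Join morpheme glosses with hyphens, as in interlinear glossed text.
        return "-".join(m.gloss for m in self.morphemes)


@dataclass
class Sentence:
    transcription: str
    translation: str
    words: List[Word] = field(default_factory=list)
    # Hypothetical alignment to a media file, in seconds (None if unaligned).
    start: Optional[float] = None
    end: Optional[float] = None

    def gloss_line(self) -> str:
        # Build the gloss tier from the word-level glosses.
        return " ".join(w.gloss() for w in self.words)


sentence = Sentence(
    transcription="los gatos duermen",
    translation="The cats sleep.",
    words=[
        Word("los", [Morpheme("los", "the.PL")]),
        Word("gatos", [Morpheme("gato", "cat"), Morpheme("s", "PL")]),
        Word("duermen", [Morpheme("duerm", "sleep"), Morpheme("en", "3PL")]),
    ],
    start=1.2,
    end=2.9,
)

print(sentence.gloss_line())
```

The point of the sketch is only that a gloss tier, a word list, and a media alignment can all hang off one object, which is exactly what a plain-text chat message or a shared folder can't express.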
Let’s think interface
These are all worthy areas to keep in mind. Each demands its own technical expertise: actually implementing these (connectivity, versioning, etc.) in turn requires skill development, testing, and so on.
But let’s just imagine that these things magically work. What does the actual user interface look like? What are the buttons? What are the boxes? Where do the video player and the record button and so forth live on the screen?
I think this kind of “participatory design” is a great way to start, actually.
It reminds me of an issue that came up in @lgessler’s paper on “Developing without developers” (PDF): the exercise there was to reimplement ELAN on the web by cloning the interface quite precisely. The fidelity is quite amazing:
Here’s the quote about why cloning ELAN was chosen as the task:
Choosing an existing app obviated the design process, saving time and eliminating a potential confound. ELAN was chosen in particular because of its widespread use in many areas of linguistics, including LD, and because its user interface and underlying data structures are complicated. We reasoned that if our approach were to succeed in this relatively hard case, we could be fairly certain it could succeed in easier ones, as well.
ELAN is certainly a complicated interface, and the web port is an amazing achievement. But ELAN has evolved to do what it does over a very long process of development. The key fact to my mind is that this process had a starting point, an initial idea, that only included some of the tasks that documentary linguists & colleagues are concerned with (namely, media/transcription alignment). Other things have been added incrementally, but I think that if ELAN were redesigned from scratch it would have not just different capabilities but a very different interface.
That’s where I’d like to gently nudge this conversation… we know what the ingredients are, but what does the cake look like, even if we don’t have a recipe?
(I just amended the title of this topic to include “user interface” to emphasize this… might be a record for longest title!)