Here’s a rather technical (and long) explanation of what I meant above by “coding to the interface”:
Since version 1.1, CLDF includes a MediaTable, and a mediaReference property to link from other tables to media.
This can be used to e.g. link audio files to forms in a Wordlist, as is done in Henrik Liljegren’s Hindukush data.
In the following, I’ll try to show how such a specified data format helps with tool development (enabling things such as the audio glossary talked about above).
The price of a flexible spec is paid in higher complexity of implementation. So the flexible URL discovery and different URL schemes (such as http:, file: or data: URLs) are best supported in a basic CLDF library such as pycldf, rather than implementing bits and pieces of it in higher-level tools. As of version 1.26, pycldf provides a Python API that abstracts away the details of the spec and instead offers straightforward File.read and File.save methods for File objects (which can easily be derived from rows in a MediaTable). With this API, implementing a commandline program to download all media files for a dataset boils down to a handful of lines of code.
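To make the "handful of lines" concrete, here is a minimal sketch of such a download helper. It assumes that pycldf (1.26 or later) exposes a MediaTable wrapper in pycldf.media which yields File objects when iterated. The function name download_media and its arguments are made up for illustration, so check the pycldf documentation for the exact API before relying on it.

```python
import pathlib


def download_media(metadata_path, outdir):
    """Save every file listed in a dataset's MediaTable into `outdir`.

    Hypothetical helper: assumes pycldf >= 1.26 and its pycldf.media module.
    """
    # Imported lazily, so the sketch can be loaded without pycldf installed.
    from pycldf import Dataset
    from pycldf.media import MediaTable

    outdir = pathlib.Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)

    dataset = Dataset.from_metadata(metadata_path)
    count = 0
    for media_file in MediaTable(dataset):
        # File.save is expected to deal with the URL scheme (http:, file:,
        # data:) itself, so the caller only names a target directory.
        media_file.save(outdir)
        count += 1
    return count
```

Calling download_media("liljegrenhindukush/cldf/cldf-metadata.json", "audio") should then leave one local file per MediaTable row in the audio directory, without the calling code ever looking at a URL.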
So with a specification and implementation in place, we can go about putting our audio glossary together. Since HTML creation is typically done using templates, the process has two steps: assembling the data for the template, and writing the template itself.
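The two steps can be sketched with nothing but the Python standard library. This is not how cldfviz does it. The record keys (language, form, audio_url), the function names and the sample values are placeholders chosen for illustration, and a real implementation would read the forms and media URLs from the CLDF dataset instead.

```python
from html import escape
from string import Template

# Step 2 (the template): a page skeleton plus one table row per form.
PAGE = Template("""<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>$title</title></head>
<body>
<h1>$title</h1>
<table>
$rows
</table>
</body>
</html>
""")

ROW = Template(
    '<tr><td>$language</td><td>$form</td>'
    '<td><audio controls src="$audio"></audio></td></tr>'
)


def assemble_rows(forms):
    """Step 1 (the data): keep only what the template needs, HTML-escaped."""
    return [
        {
            "language": escape(f["language"]),
            "form": escape(f["form"]),
            "audio": escape(f["audio_url"]),
        }
        for f in forms
        if f.get("audio_url")  # skip forms without a recording
    ]


def render_page(title, forms):
    """Fill the page template with one row per form that has audio."""
    rows = "\n".join(ROW.substitute(row) for row in assemble_rows(forms))
    return PAGE.substitute(title=escape(title), rows=rows)


# Placeholder records, not taken from any real dataset.
sample_forms = [
    {"language": "Language A", "form": "form-a", "audio_url": "audio/a.wav"},
    {"language": "Language B", "form": "form-b", "audio_url": None},
]
page = render_page("hand", sample_forms)
```

Keeping data assembly separate from the template means the same rows could feed a different layout without touching the code that reads the dataset.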
Then, creating an HTML page like the attached file is as “easy” as running
cldfbench cldfviz.audiowordlist liljegrenhindukush/cldf/cldf-metadata.json cldf:name=hand -o test.html
Note: The cldfviz functionality demonstrated above isn’t released yet. Thus, you’d need to install cldfviz from a repository clone to reproduce this locally.
test.html (22.7 KB)