What’s a <ruby>, you say?
This tag doesn’t work in all browsers.
Check out this rather lovely-ly formatted page about Japanese dictionaries:
If you take a look, you will see stuff like this:
The topic du jour is this stuff:
Kokugo 国語
As you can see, there are three kinds of writing going on there, and this is a feature of Japanese. The three are:
- Romaji: Kokugo
- Kanji: 国語
- “Furigana”: こくご
Notice how the Furigana are above the Kanji. There are HTML tags specifically for doing this. The term “ruby” itself has a rather interesting and circuitous history, but let’s set that aside.
Here’s what the HTML for 国語 looks like (I’ve simplified it a bit from the page above):
<ruby>国語
<rp>(</rp>
<rt>こくご</rt>
<rp>)</rp>
</ruby>
As you can see there are three tags here:
- <ruby>: this wraps everything.
- <rp>: this holds the fallback parentheses that go around the “ruby text” (the top bit) when the browser doesn’t support ruby, so the reader sees 国語(こくご) rather than a run-together 国語こくご. It’s weird. Honestly, you can kind of ignore this one.
- <rt>: “ruby text”. This is the actual annotation content.
国語
So the <rp>-less version is quite simple:
<ruby>国語
<rt>こくご</rt>
</ruby>
So yeah, it’s basically like glossing. In fact, you can even use CSS to put the ruby below the glossed text, just like we do in linguistics, see here. Except, browser support is super duper bad for that.
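For the curious, here is a minimal sketch of the ruby-below-the-text idea. It assumes the CSS ruby-position property with the value under; the class name gloss is something I made up for illustration, and whether a given browser honors the rule is exactly the support question just mentioned.

```html
<style>
  /* ruby-position: under asks the browser to draw the annotation
     below the base text instead of above it.
     "gloss" is a hypothetical class name, not anything standard. */
  ruby.gloss {
    ruby-position: under;
  }
</style>

<ruby class="gloss">国語
<rp>(</rp>
<rt>こくご</rt>
<rp>)</rp>
</ruby>
```

Where the property isn’t supported, the annotation should simply stay in its default position above the text.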
This all sure looks like it should be usable for interlinear glossing on the web. And I guess it is, in a way. But personally I don’t think it’s the right match for us. There are several problems:
It implies a ‘one-gloss’ model.
In linguistics we have glosses with a ton of annotations, not just one. What if you want to use more than one orthography, for instance?
dzümle-si-ni
𝔊𝔢𝔰𝔞𝔪𝔱𝔥𝔢𝔦𝔱=𝔦𝔥𝔯𝔢=𝔡𝔦𝔢
Gesamtheit=ihre=die
Finck, Franz Nikolaus. 1909. Die Haupttypen des Sprachbaus. Leipzig: B.G. Teubner.
Which one gets the <rt>? You could imagine many more similar situations, with tone, say.
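To make the problem concrete, here is roughly what forcing all three lines into ruby might look like. This is a sketch, not a recommendation: it assumes the nesting trick (one <ruby> wrapped inside another) to get a second annotation, and the choice of which line lands in which <rt> is arbitrary, which is sort of the point.

```html
<!-- Sketch only: one ruby nested inside another to carry two annotations.
     Which line counts as "the" gloss is an arbitrary choice here. -->
<ruby>
  <ruby>dzümle-si-ni
    <rt>𝔊𝔢𝔰𝔞𝔪𝔱𝔥𝔢𝔦𝔱=𝔦𝔥𝔯𝔢=𝔡𝔦𝔢</rt>
  </ruby>
  <rt>Gesamtheit=ihre=die</rt>
</ruby>
```

Every further tier (tone, another orthography, a free translation) would need yet another layer of wrapping, and nothing in the markup says what any of the tiers actually is.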
The annotated text doesn’t get its own node.
In the example above, what is presumably the “baseline” (dzümle-si-ni) doesn’t get its “own” tag at all; it’s just kind of floating there. Obviously you could wrap that in a <span lang=tr> or something, but at that point you’re creating markup to target via CSS anyway. So just… use your own markup.
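For what it’s worth, the <span> workaround would look something like this. The lang value tr comes from the paragraph above; the lang="de" on the <rt> is my own assumption about how you might mark the German gloss.

```html
<!-- The base text gets its own element only because we add one by hand. -->
<ruby>
  <span lang="tr">dzümle-si-ni</span>
  <rt lang="de">Gesamtheit=ihre=die</rt>
</ruby>
```

A selector like ruby > span[lang] would then reach the base text, which is precisely the markup-for-the-sake-of-CSS situation.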
Browser support is lousy.
This is the real deal breaker; browser support for <ruby> and friends isn’t great. It’s in the HTML standard, but that doesn’t mean much if it’s not supported “in the wild”.
There are, by the way, other perfectly cromulent ways to format glosses in HTML, but that’s another topic.
Anyways. Some HTML blather for your consideration.
Also I just discovered that there is this thing. Good grief, someone else write that post.