Using Transkribus for Tibetan OCR

An interesting article on using Transkribus to OCR 11th-13th C. Tibetan texts… in cursive manuscripts!

I’m not sure if this scan is from the collection in question, but it might suffice as an indication of the kinds of texts they’re dealing with:


Y’all. Transkribus is nuts.

The reported “CER” (“Character Error Rate”) figures are… a little amazing:

| Model name | No. of pages checked | CER% for Training Set | CER% for Validation Set |
|---|---|---|---|
| Model A | 40 | 1.39% | 4.28% |
| Model B | 80 | 1.35% | 4.45% |
| Model C | 120 | 1.18% | 4.73% |
| Model D | 160 | 1.15% | 2.33% |
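For anyone wondering what a CER number actually measures: as it's commonly defined, it's the edit distance between the OCR output and the correct transcription, divided by the length of the correct transcription. Here's a minimal sketch of that common definition. I'm assuming plain Levenshtein distance over Unicode code points; the article excerpt above doesn't say exactly how Transkribus counts characters (Tibetan stacks could be counted differently), and the function names and the transliterated sample strings are just mine for illustration.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution (or match)
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance / reference length.
    Assumes code-point-level comparison (an assumption, see above)."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong character in a 9-character line.
print(f"{cer('bkra shis', 'bkra shes'):.2%}")  # prints 11.11%
```

By that definition, a validation CER of 2.33% works out to roughly one wrong character in every 43.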

There’s a nice video about the project here:
