Text to Speech anyone?

Aidan Pine and team (in addition to their Gi2Pi paper) also have an ACL paper on text to speech (https://aclanthology.org/2022.acl-long.507.pdf). TTS is something I’ve been wondering about for ages, for a couple of applications. One is that the writing systems of several Aboriginal languages are a barrier for their communities, but we don’t have the resources to record fluent speakers saying every word in the dictionaries. Giving a pronunciation indication through TTS would be incredibly useful, particularly for communities where Aboriginal English and/or Kriol are widely used. Does anyone want to work together on creating a TTS implementation for the language(s) you work on?

3 Likes

At the University of Groningen (the Netherlands), we are working on speech technology for Gronings (a Low Saxon language variant spoken in the North of the Netherlands).

We’re currently working on TTS systems for each of the variants of Gronings (also using FastSpeech 2, which is mentioned in the paper by Aidan Pine and team). These systems will be embedded in our community-oriented online platform for the documentation of Gronings (see https://www.woordwaark.nl/; a new version is being developed).

The TTS systems would be very useful for providing a pronunciation indication, and would complement the course and mobile app we developed to introduce primary school children to their local dialect.

@fauxneticien and I are also working on other applications of these systems to improve access to untranscribed Gronings speech. The work is still in its early stages, but I would be happy to share details if you are interested!

Best,
Martijn Bartelds

3 Likes

Welcome aboard, @bartelds, happy to have you and looking forward to hearing more about your work!

1 Like

Hi,
I’ve been considering this for the community I advise. However, it is in the context of the revitalization and reclamation of a critically endangered language in southern Mexico, where language technologies face a complex set of challenges. I’d like to see if it is even viable, especially because we will be working on a community dictionary in 2022-2023. We developed a practical orthography in 2019, so it could be a good test. So, yes, I’d like to know more and would be interested in giving it a try. Let me know what’s next.

2 Likes

I’m resurrecting this 2022 (!!) topic because Alessio Tosolini and I are giving a talk at the LSA in January on our TTS training experiences, and I’d love to hear more about your experiences of TTS in the last few years. The crux of our paper is that the technology is now good enough that you can make realistic voices from general language documentation recordings. However, this works best with single-speaker materials, so in communities where everyone knows everyone else it brings us very close to (if not actually into) the territory of “deep fakes”, and it raises huge ethical issues in a bunch of ways. For example, a model trained on recordings in one language can be used to create resources for another language, producing an identifiable voice of someone who doesn’t actually speak the language the resource is for. That speaker may have given permission for their materials to be used to help other communities, but at a time when this technology didn’t exist.

3 Likes