Monthly tool tinker/showcase Zoom?

Just a random thought to see if we have many/any takers. Fresh off attending my monthly short story bookclub, wondering if anyone would be interested in a monthly Zoom meetup for tinkering with some “NLP tool”.

The idea would be to showcase some tool that people are interested in working with. Like a book club:

  1. We vote on the next month’s tool
  2. The session leader (I expect this to be mostly me) does some groundwork here and there during the month, like creating a Colab with the tool and the right dependencies installed
  3. We spend maybe an hour or so tinkering with the Colab over screen share (perhaps in breakout rooms if there are too many people!)

I plan for this to be a mainly social and super-low stakes gathering from which we might even just build:

  1. a catalogue of tools and
  2. minimal working examples for people to get started with

Hoping this would partly help with the “so much tooling has been developed for low resource NLP but so little is accessible” problem (cc @SarahRMoeller and @cbowern).

Happy to hear thoughts and especially candidate tools to vote on!


I’d be up for this! Teaching field methods this spring

I’d be in for something like this!

Great idea @fauxneticien! I think one on setting up the morphological parser in FLEx might be appreciated. I’m glad to volunteer my sketchy knowledge there.

Also, I’m teaching a project-based class this spring where students have an option to prepare a tutorial for some technical skill. So let me know if you want volunteer leaders later this spring.

I’d be interested in this!

Hi all — happy New Year! Glad to hear that there’s some interest!

To kick things off, let’s start with some polling for:

  1. a date/time for the first meeting
  2. a tool to demo

For the time (item 1), I think for a synchronous Zoom we might just be able to make the timezones work for @cbowern, @SarahRMoeller (US Eastern), me (US Pacific), and @skalyan (AU Brisbane). (I’m assuming somewhere in the US for @jrrabbit based on profile/intro post.)

I think we might be looking at something like 6 pm US Eastern, 3 pm US Pacific, next day 9 am AU Brisbane (see The World Clock Meeting Planner - Details; I arbitrarily picked February 14/15 as a date).

Action item: can each of you respond with whether this time configuration might work and, if so, a list of US dates (Siva, it will be +1 day for you, e.g. February 15 for US February 14)?

Action item: can each of you suggest a tool or add your votes to another tool already suggested?

To keep things simple since there’s only 5 of us so far, here’s a (publicly-)editable Google Sheet:

Just add dates/tools and change the numbers.


Okay, since the vote for the date is tied, let’s do February 21st, 3 pm Pacific (6 pm Eastern; February 22, 9 am AU Brisbane).
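
For anyone who wants to double-check the conversion, here is a minimal sketch using Python's standard `zoneinfo` module. It assumes the year is 2025 and uses the IANA zone names `America/Los_Angeles`, `America/New_York`, and `Australia/Brisbane` as stand-ins for "US Pacific", "US Eastern", and "AU Brisbane"; the `ZONES` table and the `local_times` helper are hypothetical names made up for illustration.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Assumed IANA zone names for the three locations mentioned in the thread.
ZONES = {
    "US Pacific": "America/Los_Angeles",
    "US Eastern": "America/New_York",
    "AU Brisbane": "Australia/Brisbane",
}


def local_times(meeting: datetime) -> dict[str, datetime]:
    """Convert one timezone-aware meeting time into every zone in ZONES."""
    return {
        label: meeting.astimezone(ZoneInfo(name))
        for label, name in ZONES.items()
    }


# February 21, 3 pm Pacific (the year 2025 is an assumption).
meeting = datetime(2025, 2, 21, 15, 0, tzinfo=ZoneInfo("America/Los_Angeles"))

# Print the meeting time as each participant would see it locally.
for label, when in local_times(meeting).items():
    print(f"{label:12s} {when:%a %b %d, %I:%M %p}")
```

Building the meeting time as a timezone-aware `datetime` and converting with `astimezone` lets the library handle the date rollover for Brisbane, so nobody has to do the "+1 day" arithmetic by hand.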

Looks like the tool choice is also tied so I’m going to pick pympi. I’ll send out a calendar invite with a Zoom link to everyone’s e-mails and post a public link in this thread closer to the date.

Talk to y’all soon!

Would you post an announcement on the SIGEL mailing list? Or may I?


Could you send that out on SIGEL @SarahRMoeller ? Thanks!

Hi! I saw this on the SIGEL list. I am not sure that I’m available the 24th but I would like to help out. I did a lot of interesting things with pympi for the Talking Michif Dictionary and the source is available here: GitHub - p2wilrc/mtd-michif: Electronic version of the Turtle Mountain Michif Dictionary

For a future session with pyfoma I could also chip in as I recently built a morphological analyzer with it and I could show you how to support rhythmic templates with reduplication :slight_smile:


Another interest of mine which would be cool to look at would be tools for extracting information from PDFs, how to deal with idiosyncratic font encodings, processing tables… things that can be very handy when working with existing language documentation.


Some possible topics related to the above:

  • How text is encoded (or not) in a PDF
  • How to use fontforge to view fonts and encodings in a PDF
  • Correcting encodings using convertextract
  • Extracting tables and figures with docling (yes … they stole our name … and please ignore “gen AI” in the description of this tool, it is also very good at converting PDF to HTML :wink:)
  • Visualizing PDF objects and metadata in a Colab notebook

Hi! I saw this on the SIGEL list. I am not sure that I’m available the 24th but I would like to help out.

@dhdaines — do you mean the 21st?

I did a lot of interesting things with pympi for the Talking Michif Dictionary and the source is available here

Thanks! I’ll take a look, but I assume the raw data itself isn’t there or openly available right? I was going to source some DoReCo data for demos.

In regard to the availability of “raw data” for Michif, please contact me to discuss. Or, maybe David (Huggins-Daines) can contact me!

If you will be using DoReCo for this, and perhaps for other apps that may be particularly useful to those working on polysynthetic or contact languages, please consider including data from Warlpiri and Light Warlpiri.
Kihchi-marsii! Thank you!

Hi! Oops, yes I mean the 21st!

Hi Heather! I probably can’t be there for the pympi discussion on the 21st - but I will contact you :slight_smile:

Hi all — Zoom info for those who didn’t already have the calendar invite:


Hi there,

Nay San is inviting you to a scheduled Zoom meeting.

Topic: LangDoc Monthly Tool Tinker
Time: Feb 21, 2025 03:00 PM Pacific Time (US and Canada)

Join from PC, Mac, Linux, iOS or Android: Launch Meeting - Zoom
Password: 152561

Or iPhone one-tap (US Toll): +18333021536,99101258351# or +16507249799,99101258351#

Or Telephone:
Dial: +1 650 724 9799 (US, Canada, Caribbean Toll) or +1 833 302 1536 (US, Canada, Caribbean Toll Free)

Meeting ID: 991 0125 8351
Password: 152561
International numbers available: https://stanford.zoom.us/u/anlWeu6V3

SIP: 99101258351@zoomcrc.com
Password: 152561

Ah, I’m so sorry I missed it! I was recovering from a major deadline. Hope it went well, and I’d be interested to read a recap if anyone cares to share :slight_smile:


@lgessler, no worries. I think it went super well: we mostly just hung out and chatted for about an hour and a half, and didn’t get to the pympi demo. Hope you can join next time!
