Celtic Knot Conference 2017/Programme/CK129

A quick introduction to Kathabhidhana.

Title: Kathabhidhana, open source toolkit to record pronunciations of any world language

Auditorium - University of Edinburgh Business School.

Date: 6 July 2017

Time: 3.00pm to 3.15pm.

Duration: 15-minute presentation by Subhashish Panigrahi and Prateek Pattanaik

Venue: University of Edinburgh Business School - Auditorium.


  • Subhashish Panigrahi - Asia Community Catalyzer at Mozilla. Community builder, author, public speaker and Wikimedian. Video presentation on the Kathabhidhana project, an open toolkit for anyone to record their language in a human- and machine-readable form.
  • Prateek Pattanaik - high school student, young researcher and archivist. He has been digitising some of the oldest and rarest texts of the Odia language and growing the Odia Wikisource with public-domain texts through his digitisation project, 'Pothi'. As a researcher of the classical poetry, mythology and music of Odisha, he has been preserving ancient books that provide invaluable material for linguistic research. He is also actively working on building digital tools for language preservation.

Overview of topic:
Kathabhidhana is an open toolkit for anyone to record their language in a human- and machine-readable form. It is a collection of open source tools, educational material, and open sample datasets. It not only helps people record their language but also helps create resources that can be used to build machine learning and natural language processing tools. I have personally recorded over 2,000 words in my native language, Odia. More about this toolkit is summarized in a quick video.

Notes: Etherpad link.

Supporting material:

Related sessions: