Myanmar in Turbulent Times: The Challenges of Translating Burmese with Greater Accuracy

Track: Multilingual AI | TA4
Wednesday, June 9, 2021, 2:30pm – 3:15pm
Held in: Jujama 2
Vassilis Korkas - lexiQA

Burmese is a low-resource language spoken by 50 million people. Among the peculiarities of the language are its nonstandard romanization, its limited use of English/Western loanwords, and the preferential use of spacing and punctuation. Moreover, the lack of high-quality linguistic data has been a challenge for natural language processing (NLP) and machine translation (MT) research, while the recent switch from the traditional Zawgyi font to Unicode has not yet been completed in practice. In these turbulent times for Myanmar, could a hybrid approach (NLP + human-in-the-loop) address the issues that emerge when translating user-generated content, which includes fake news and hate speech?

Takeaways: Attendees will get an overview of the current challenges for NLP and MT in Burmese, and specific examples of how some of these challenges can be addressed with a novel approach.