This session will offer several quick and dynamic presentations covering topics and ideas too interesting to ignore:
Bitextor: Get the Translation Data You Need from the Web – Miguel Esplá-Gomis (Universitat d’Alacant)
In this session we will present a free, open-source tool for harvesting parallel data from the internet. The harvested data can be used as a translation memory or to train machine translation systems. This tool is being developed as part of the CEF Project. We will briefly cover how the tool works and how it may be adapted to different usage scenarios.
Takeaways: Attendees will gain an awareness of how useful web data is for the translation industry and will learn about an industry-mature, freely available tool for crawling parallel data from the internet.
Hash ID: How We Fix UI/UA Translation Inconsistencies in One Click – Andrey Berestyansky (Kaspersky Lab)
Control names mentioned in the user assistance materials must match the UI precisely; this is a major factor in documentation quality. Verifying localized control names becomes difficult when you have to deal with multiple translation rounds performed by different vendors, various help formats and short release cycles. To overcome this, we have developed and implemented a technology we call Hash ID.