Meta launches new program to improve speech and translation AI

Meta is launching a new program in partnership with UNESCO to collect speech recordings and transcriptions that the company says will support the development of future openly available AI.

The program, the Language Technology Partner Program, is seeking collaborators who can contribute more than 10 hours of speech recordings with transcriptions, large amounts of written text, and sets of translated sentences in “diverse languages.” According to Meta, partners will work with the company’s AI teams to integrate these languages into AI speech recognition and translation models, which — when finalized — will be open-sourced.

Partners so far include the government of Nunavut, a sparsely populated territory in northern Canada. Some residents of Nunavut speak Inuit languages collectively known as Inuktut.

“Our efforts are especially focused on underserved languages, in support of UNESCO’s work,” Meta wrote in a blog post provided to TechCrunch. “Ultimately, our goal is to create intelligent systems that can understand and respond to complex human needs, regardless of language or cultural background.”

Alongside the new program, Meta said that it's releasing an open source machine translation benchmark for evaluating the performance of language translation models. The benchmark, composed of sentences crafted by linguists, supports seven languages and can be accessed, and contributed to, on the AI development platform Hugging Face.

Meta is framing both initiatives as philanthropic. But the company stands to benefit from upgraded speech recognition and translation models.

Meta continues to expand the number of languages that its AI-powered assistant, Meta AI, supports, and to pilot features such as automatic translation for creators. Last September, Meta announced that it would begin testing a tool to translate voices in Instagram Reels, allowing creators to dub their speech and auto-lip-sync it.

Meta’s treatment of content in languages other than English across its platforms has been the target of much criticism. According to one report, Facebook left almost 70% of Italian- and Spanish-language COVID misinformation unflagged compared to just 29% of similar English-language misinformation. And leaked documents from the company reveal that Arabic-language posts are regularly flagged erroneously as hate speech.

Meta has said that it’s taking steps to improve its translation and moderation technologies.
