Tuesday, 02 January 2024 12:17 GMT

How AI Could Help Safeguard Indigenous Languages


Author: Anna Luisa Daigneault
(MENAFN- The Conversation) If there are few speakers left of a language, how does a community revive it? In our current era, 3,000 languages are at risk of extinction due to the pressures of colonization, globalization, forced cultural assimilation, environmental devastation and other factors.

According to Canada's Commission for Indigenous Languages, “research shows that no Indigenous language in Canada is safe and that all are in varying stages of endangerment.”

Our society is also being shaped by the rapid rise of artificial intelligence. Can AI be used for the benefit of Indigenous language survival in Canada and elsewhere?

According to the World Economic Forum, most AI chatbots are trained on 100 of the world's 7,000 languages. English is the main driver of most large language models.

This scenario leaves the bulk of the world's languages in the dust. In the coming years, will AI contribute to language revitalization, or language oppression?

A language in a box

In a 2023 TEDx talk, Northern Cheyenne computer engineer Michael Running Wolf shared his design of a cedar box that looks both ancient and contemporary. He described the dragonfly-adorned device as a “cedar-enclosed, offline Edge AI that contains the inner workings of a minimal voice-based language curricula - in other words, a language in a box.”

He proposed that conversational AI technology, much like Amazon Alexa or Google Home, could help language learners improve their fluency.

Running Wolf is the technical director of the First Languages AI Reality initiative at the Québec Institute for Artificial Intelligence. The program propels Indigenous scholars and technologists towards creating innovative solutions regarding language loss.


A TEDx Talk by Michael Running Wolf on how AI can assist Indigenous language learning.

Voice-controlled tools trained via machine learning could serve as AI assistants for speakers who wish to hear unfamiliar sounds pronounced accurately, and practice their own pronunciation. This technology could establish a new means for facilitating oral transmission, which is crucial when there are few fluent speakers left.

At the heart of Running Wolf's project is Indigenous data sovereignty, which ensures that Indigenous people retain control over their data.

A place in the digital world

On the other side of the world, in the Philippines, AI scholar and politician Anna Mae Yu Lamentillo is on a quest to support the Indigenous languages of her home country. She created NightOwlGPT, a new AI-powered translation app.

In an email to me, Lamentillo wrote:


NightOwlGPT creator Anna Mae Yu Lamentillo. (Arwin Doloricon)

We have seen that in the hands of the powerful, AI software can lead to oppressive forms of control, such as excessive AI-powered surveillance by Amazon and the U.S. government's unethical data mining tactics.

When it comes to the survival or extinction of languages, it is important to question the power behind AI tools. Who controls them, and who benefits from them?

When I asked about the democratization of AI, Lamentillo noted the need for inclusivity:

Diversity of voices
Linguistics professor Emmanuel Ngué Um. (Emmanuel Ngué Um)

At a recent workshop series on endangered languages, Emmanuel Ngué Um, a professor of linguistics at the University of Yaoundé I in Cameroon, spoke on behalf of a research team of African linguists.

They are currently using Mozilla's Common Voice platform to create open-source datasets containing thousands of words and audio recordings in 31 African languages.

The platform aims to make speech recognition and voice-based AI more inclusive by crowd-sourcing a massively multilingual speech corpus. But this process is not without significant challenges in Africa.

Ngué Um noted that building datasets for languages with many dialects is not straightforward. There may not be a standardized spelling or pronunciation that should be used by AI as the accepted norms for the language.

Because of postcolonial changes, many African languages do not have one unified or agreed-upon writing system. This issue can slow the creation of teaching tools, but many local efforts backed by UNESCO are underway to change this.

So, how do automatic speech recognition tools deal with dialectal diversity? And how do text-to-speech models handle competing writing systems?

As Ngué Um wrote in an email to me:

It is clear that AI engineers and computational linguists need to integrate thoughtful approaches that take into account the unique circumstances of each language.

In the not-too-distant future, using AI tools to learn and communicate in under-resourced languages may become the norm. However, that shift depends on financial backing, accurate training data for machine learning, and community desire to embrace AI. Ultimately, data sovereignty and equitable access must be at the core of AI tools.


The Conversation
