Get Involved
What?
We want to use NLP and ML to:
1. Train a model to understand words and sentences and reproduce sentence structure in different low-resourced African languages, including Ghɔmálá and other Cameroonian languages.
2. Train a model to recognize words and sentences in those languages and translate into and from them.
How?
Program Engage Africa is an initiative aimed at enhancing the presence and usability of low-resourced African languages, including Ghɔmálá and other Cameroonian languages, in the realms of NLP and ML. The program is structured around three key projects (Project Mirror Africa, Project Embrace Africa and Project Include Africa), each with a distinct focus and set of objectives. We are currently working on the initial iteration of Project Mirror Africa.
Project Mirror Africa is currently underway for Ghɔmálá and other Cameroonian languages and comprises three main tasks:
Corpus Building
– Objective: To compile a comprehensive database encompassing both textual and audio materials. This corpus is essential for understanding, preserving and promoting the linguistic intricacies and variability of the languages. It is the first step toward all other activities.
– Activities: Collection, digitization and validation of written texts and audio recordings, and transcription of publicly transmitted oral traditions. This involves collaboration with the community, linguists, and cultural experts to ensure the corpus is representative, extensive and accurate.
AI-Powered Language Tools Building
– Objective: To create AI-driven tools based on the corpus. These tools are designed to be user-friendly and accessible, promoting the use of the languages in digital spaces.
– Activities: Developing a virtual assistant for interactive language tasks, instant translation, speech recognition systems, and text-to-speech technology. These tools are geared towards both native speakers and those interested in learning or interacting with the languages.
Machine Translation Model Building
– Objective: To develop a machine translation model. This model aims to provide accurate and culturally sensitive translations, bridging the gap between the low-resourced languages and other languages.
– Activities: Utilizing the corpus to train AI algorithms, ensuring they can handle the nuances and contextual subtleties of the languages. The development also involves continuous testing and refinement to enhance accuracy.
Each of these tasks is intricately linked, with the corpus serving as the foundational element that drives the development of the AI-powered tools and the machine translation model.
When?
This is a long-term project. Here’s a tentative outline for each language:
Initial Research (months)
Community Engagement and Partnership Formation (months)
Budget Evaluation and Resource Preparation (months)
Digitization, Documentation, Data Collection (started, ongoing)
Technology Development and Iteration (months)
Evaluation and Expansion (years)
Who?
Program Engage Africa involves collaboration with community speakers, linguists, tech experts, cultural experts, archivists, localization and globalization specialists, language preservation advocates, and all other stakeholders and interested parties. This collaboration helps ensure that the corpus is representative, extensive and accurate, and that the best available tools and expertise are brought to bear on our goals.
The focus is on ensuring that these technological advancements are grounded in the cultural and linguistic context of the communities, thereby preserving and modernizing the languages while maintaining their rich heritage.
Do you want to help?
How would you like to help?
Updates
The Importance of Bidirectional Data Capture for AI Solutions in African Languages
When collecting translated text to build AI solutions for African languages, some practitioners mistakenly believe that starting with a major language and translating as much content as possible into the target African language will provide sufficient bilingual text...
NLP’s Role in Supporting Low-Resourced African Languages
The field of Natural Language Processing (NLP) holds transformative potential for low-resourced African languages. NLP, the intersection of computer science, artificial intelligence, and linguistics, involves the programming of computers to process and analyze large...