Using automatic speech recognition to transcribe voice messages from African farmers

Author: Farm Radio International, March 6 2024 - How do you gather qualitative information from thousands of people, quickly and at scale? 

That's a question Farm Radio International, a communication for development organization that uses interactive radio as its primary tool to reach and amplify the voices of rural Africans, has an answer to: On Air Dialogues. 

This on-air polling process uses Interactive Voice Response technology in tandem with Farm Radio's Uliza Interactive suite of services. Radio programs spark discussion about topics, then listeners call Uliza Interactive to leave their responses - both qualitative and quantitative - to questions. 

Though successful - during a 2022 On Air Dialogue about climate change, 14,356 respondents left 9,317 audio comments - there is one ongoing challenge: the volume of responses means that our team of staff and volunteers does not have the capacity to manually transcribe all responses. 

To complicate matters, answers may be recorded in a variety of local languages and dialects, so the transcriber must be fluent in both the language of the recording and the language of transcription (English or French). As a result, making full use of this rich data set is resource-intensive, and not all of listeners' contributions, in all their variety, are being considered and acted upon.

The solution: Training automatic speech recognition on the Luganda language

In 2022, we partnered with the International Food Policy Research Institute (IFPRI), a research centre of CGIAR, to test the use of automatic speech recognition (ASR) to transcribe voice recordings from listeners. ASR is a computer’s use of machine learning or artificial intelligence (AI) to convert human speech into written text. Learn more about the partnership and research project in CGIAR's blog post.

ASR models tend to be more familiar with languages from the Global North due to limited training data and funding for training AI on African languages. This means that before we can deploy ASR for transcription in our projects, we need to train the tool on the languages typically spoken in the recordings. In practice, that means getting the tool to transcribe many recordings in a specific language so that it becomes familiar with the language and our team can correct any errors it makes.

Our Digital Innovation team is using an open-source ASR model. Farm Radio International, with support from the CGIAR Initiative on Digital Innovation, has been training the model on Luganda, a widely spoken language in Uganda, because we have a large data set available in this language. The recordings were submitted to our partner Radio Simba for our project about Nature-based Solutions to climate change, mainly through On Air Dialogues.

For a typical radio episode, Radio Simba might receive around 17,000 voice recordings over the course of two days. It would take our team weeks or months to process all the recordings, which would delay data analysis and might not be the best use of our time. Using AI allows us to automate this process and make it more efficient. That way, our team can focus on data analysis, broadcaster training and other aspects of project implementation.

"As technology advances, artificial intelligence is becoming increasingly pervasive and has revolutionized sectors such as agriculture, healthcare and transportation," said Farm Radio International's Digital Innovation manager, Nathaniel Ofori. "The use of ASR has proved to be a game changer in amplifying the voices of rural communities."

How it works

Manual transcribers render each recording in English or French. The ASR model, by contrast, transcribes directly from Luganda audio to Luganda text. AI complements manual transcription rather than replacing it: ASR is faster, but its results are not always accurate. Accuracy varies with factors like audio quality, the speaker's accent and background noise. Transcribers check a random selection of the machine-generated transcriptions and edit them where needed so that they match the recording.
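As an illustration only, the spot-check described above could be sketched in Python as follows. The post does not say how accuracy is measured; word error rate (WER), a common ASR metric, is an assumption here. The function names `word_error_rate` and `pick_for_review` are hypothetical, and the placeholder strings stand in for real transcripts.

```python
import random


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    `reference` is the human-corrected transcript and `hypothesis` is the
    machine-generated one. WER is an assumed metric, not one named in the post.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # prev[j] = edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def pick_for_review(recording_ids, sample_size, seed=None):
    """Draw a random subset of machine transcriptions for a human check."""
    rng = random.Random(seed)
    ids = list(recording_ids)
    return rng.sample(ids, min(sample_size, len(ids)))


# Placeholder transcripts: one of four words differs, so WER is 0.25.
print(word_error_rate("a b c d", "a x c d"))  # 0.25
print(len(pick_for_review(range(1000), 50, seed=1)))  # 50
```

Tracking a score like this over successive review rounds is one possible way to see whether additional training recordings are improving the model.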

When selecting recordings for training the tool, we ensure that the audio is clear, that the recording is on the right topic and that it meets the technical requirements. We train the model using recordings by both women and men to ensure that a diversity of voices is heard and that the model learns to recognize different voices.
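The selection criteria above could be expressed roughly as the sketch below. It is an assumption-laden illustration, not the team's actual pipeline: the `Recording` fields, the gender labels and the `select_training_set` function are all hypothetical, and how each criterion is judged (by a person or by a tool) is not specified in the post.

```python
import random
from collections import defaultdict
from typing import NamedTuple


class Recording(NamedTuple):
    # All field names are hypothetical, chosen to mirror the criteria in the text.
    recording_id: str
    speaker_gender: str  # e.g. "woman" or "man"
    is_audible: bool
    on_topic: bool
    meets_technical_requirements: bool


def eligible(rec: Recording) -> bool:
    """A recording qualifies only if it passes all three checks."""
    return rec.is_audible and rec.on_topic and rec.meets_technical_requirements


def select_training_set(recordings, per_group, seed=None):
    """Pick up to `per_group` eligible recordings for each speaker group."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for rec in recordings:
        if eligible(rec):
            groups[rec.speaker_gender].append(rec)
    selected = []
    for gender in sorted(groups):  # sorted for a stable group order
        pool = groups[gender]
        selected.extend(rng.sample(pool, min(per_group, len(pool))))
    return selected


recordings = [
    Recording("r1", "woman", True, True, True),
    Recording("r2", "woman", True, True, True),
    Recording("r3", "woman", False, True, True),  # not audible
    Recording("r4", "man", True, True, True),
    Recording("r5", "man", True, False, True),    # off topic
]
chosen = select_training_set(recordings, per_group=1, seed=0)
print(sorted(rec.speaker_gender for rec in chosen))  # ['man', 'woman']
```

Capping each group at the same size is one simple way to keep women's and men's voices balanced in the training data; other balancing schemes would work equally well.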

The ASR model is not yet perfect. Because Luganda is a member of the Bantu language family - which contains hundreds of distinct languages - the tool sometimes gets confused. This means that we need to continue perfecting the model with more recordings in Luganda.

Looking forward: Using automatic speech recognition to amplify African voices

Our Digital Innovation team's current focus is improving the ASR model using the Luganda language. Moving forward, we want to continue improving the model using new data sets and make the model more robust. Then, we will be able to roll out the technology in one of our projects. We are seeking investments so that we can continue to develop this model, and then expand to other languages.

ASR is an exciting opportunity to transcribe thousands of recordings from rural Africans in real time, making the process easier for our staff and volunteers and allowing us to analyze larger data sets. Ultimately, this tool will contribute to our mission of making radio a powerful force for good in rural Africa - one that shares knowledge, amplifies voices and supports positive change. By listening to and amplifying the voices of those who are rarely heard from - small-scale farmers in Africa - we can influence policy and decision makers and move the world one step closer to a sustainable, equitable future.

About Farm Radio International

Farm Radio International is a communication for development organization that uses interactive radio to get quality information to and amplify the voices of rural Africans. We design and develop projects that achieve specific development outcomes like better agricultural practices, climate change adaptation or improved health behaviours. We also support a network of more than 1,300 radio stations in 38 sub-Saharan African countries with training and content resources like scripts and backgrounders.

In 2022-2023, the radio programs broadcast through our projects had a potential audience of 60 million listeners. Of those, 4.8 million listeners improved their farming, health or nutrition practices based on what they heard on the radio programs.

As with all the blogs posted on our website, the content above does not imply the endorsement of The CI or its Partners and is from the perspective of the writer alone. We do not check facts and strive to retain the writer's voice, as is detailed in our Editorial Policy.