AI moderation misses most African languages
CEFR B2
20 Apr 2026
Adapted from Guest Contributor, Global Voices • CC BY 3.0
Photo by Zulfugar Karimov, Unsplash
AI systems that remove harmful content on social media often fail to understand many of Africa's languages. Moderators report a mismatch between the languages people speak and the languages these tools can process, which affects what content stays online and what is taken down for millions of users.
A 2025 study found that only 42 African languages appear in major language models, and that only four of them are handled consistently:
- Amharic
- Swahili
- Afrikaans
- Malagasy
Because training data is dominated by English, moderation systems produce both false positives and false negatives: content can be removed without clear explanation, while harmful posts in low-resource languages can remain online. Real cases illustrate the risks. Between January and March 2025, TikTok removed more than 450,000 videos from Kenya and banned over 43,000 accounts; by the second quarter, removals had reached 592,000. False claims also spread in Ethiopia before fact-checkers debunked them.
Researchers and civil society are working to close the gap. Groups like AfricaNLP, academic teams in Pretoria, Nairobi and Addis Ababa, and collaborations such as the one between Cohere and HausaNLP are building datasets and improving models. The African Union approved a Continental AI Strategy in July 2024, and national strategies followed, including Nigeria's in April 2025. European rules, the EU AI Act (in force since August 2024) and the Digital Services Act (fully applicable since February 2024), create pressure for greater transparency and non-discrimination, but building representative training data and operational coverage remains a practical challenge.
Difficult words
- moderation — process of checking and removing online content
- false positive — content wrongly identified as harmful
- false negative — harmful content not detected by the system
- low-resource language — language with limited digital data available
- training data — examples used to teach AI models
- transparency — openness about how systems and rules work
- debunk — show that a claim or story is false
Discussion questions
- How could limited support for many African languages in moderation tools affect daily online life for users in those countries?
- What practical difficulties do you think researchers face when building representative training data for many languages? Give one or two examples.
- What actions could platforms or governments take to reduce wrongful removals while also preventing harmful posts in low-resource languages?