LingVo.club

AI moderation misses most African languages (CEFR B2)

20 Apr 2026

Adapted from Guest Contributor, Global Voices CC BY 3.0

Photo by Zulfugar Karimov, Unsplash

Level B2 – Upper-intermediate
4 min
232 words

AI systems that remove harmful content on social media often lack understanding of Africa’s many languages. Moderators report a mismatch between the languages people speak and the languages these tools can process, which affects what stays online and what is taken down for millions of users.

A 2025 study found that only 42 African languages appear in major language models and that only four are handled consistently:

  • Amharic
  • Swahili
  • Afrikaans
  • Malagasy

Because training data is dominated by English, moderation systems produce both false positives and false negatives: content can be removed without a clear explanation, while harmful posts in low-resource languages can stay online. Real cases illustrate the risks. Between January and March 2025, TikTok removed more than 450,000 videos from Kenya and banned over 43,000 accounts; by the second quarter, removals had reached 592,000. False claims also spread in Ethiopia before fact-checkers debunked them.

Researchers and civil society groups are working to close the gap. Groups like AfricaNLP, academic teams in Pretoria, Nairobi and Addis Ababa, and collaborations such as Cohere with HausaNLP are building datasets and improving models. The African Union (AU) approved a Continental AI Strategy in July 2024, and national strategies followed, including Nigeria’s in April 2025. European rules, namely the EU AI Act (in force since August 2024) and the Digital Services Act (February 2024), create pressure for greater transparency and non-discrimination, but building representative training data and achieving operational coverage remain practical challenges.

Difficult words

  • moderation: process of checking and removing online content
  • false positive (false positives): content wrongly identified as harmful
  • false negative (false negatives): harmful content not detected by the system
  • low-resource language (low-resource languages): language with limited digital data available
  • training data: examples used to teach AI models
  • transparency: openness about how systems and rules work
  • debunk (debunked): show that a claim or story is false


Discussion questions

  • How could limited support for many African languages in moderation tools affect daily online life for users in those countries?
  • What practical difficulties do you think researchers face when building representative training data for many languages? Give one or two examples.
  • What actions could platforms or governments take to reduce wrongful removals while also preventing harmful posts in low-resource languages?
