Federico Germani and Giovanni Spitale tested four LLMs: OpenAI o3-mini, DeepSeek Reasoner, xAI Grok 2 and Mistral. Each model generated 50 narrative statements on 24 controversial topics such as vaccination mandates, geopolitics and climate change policies. The team then asked the models to evaluate the same texts under different source conditions.
When no author information was given, agreement across models was over 90%. But adding fictional authors changed the results: agreement fell sharply and a clear anti-Chinese bias appeared. The researchers collected 192,000 assessments and warn that these hidden biases matter for real applications. They recommend transparency, governance and using LLMs to assist reasoning, not to replace it.
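The article does not say how agreement was measured. As a minimal sketch, assuming agreement means the share of model pairs that give the same verdict on the same text, it could be computed as below. The function name `pairwise_agreement`, the model names and the toy verdicts are all made up for illustration and are not data from the study.

```python
from itertools import combinations


def pairwise_agreement(verdicts):
    """Share of (text, model pair) comparisons where both models give the same verdict.

    `verdicts` maps a model name to its list of verdicts, one per text,
    with every list in the same text order. This is an assumed metric,
    not necessarily the one the researchers used.
    """
    matches = 0
    total = 0
    for a, b in combinations(sorted(verdicts), 2):
        for va, vb in zip(verdicts[a], verdicts[b]):
            matches += va == vb
            total += 1
    return matches / total


# Hypothetical toy data: three models judging four texts with no author shown.
blind = {
    "model_a": ["agree", "agree", "disagree", "agree"],
    "model_b": ["agree", "agree", "disagree", "agree"],
    "model_c": ["agree", "disagree", "disagree", "agree"],
}

print(f"{pairwise_agreement(blind):.0%}")
```

Running the same function on verdicts collected with and without an author label would show whether the label alone changes how often the models agree.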
Difficult words
- researcher — A person who studies or investigates something.
- evaluate — To judge or calculate the value or quality.
- bias — An unfair preference or dislike.
- nationality — The status of belonging to a specific nation.
- concerns — Worries or issues that need attention.
- judgments — Decisions about someone or something.
- moderation — The process of managing or controlling content.
Discussion questions
- Why is it important to consider an author's background?
- How can biases in AI affect hiring decisions?
- In what other areas might AI evaluation cause problems?