Researchers Federico Germani and Giovanni Spitale at the University of Zurich tested four widely used LLMs: OpenAI o3-mini, Deepseek Reasoner, xAI Grok 2 and Mistral. First, each model generated 50 narrative statements on 24 controversial topics, including vaccination mandates, geopolitics and climate change policies. The team then asked the models to evaluate the statements under different conditions: sometimes no source was given, other times each text was attributed to a human of a certain nationality or to another LLM. The researchers collected 192,000 assessments.
With no source information, models agreed at a high level—agreement was over 90%—leading Spitale to say, “There is no LLM war of ideologies.” But when fictional sources were added, agreement fell sharply and hidden biases appeared. The most striking finding was a strong anti-Chinese bias across all models, including Deepseek. In geopolitical topics such as Taiwan’s sovereignty, Deepseek reduced agreement by up to 75% simply because it expected a Chinese person to hold a different view.
The study also found a tendency for LLMs to trust human authors more than other AIs. The researchers warn that these biases could affect content moderation, hiring, academic review or journalism, and they call for transparency and governance. They recommend using LLMs as assistants for reasoning, not as judges.
Difficult words
- bias — a tendency to favor one thing over another. (also: biases)
- evaluate — to judge or assess something. (also: evaluating, evaluations)
- identity — how someone or something is described or recognized.
- reveal — to make something known or visible. (also: revealing)
- transparency — openness and clarity about actions and decisions.
- trust — to believe in someone's reliability or truth.
- consequences — results or effects of an action.
Discussion questions
- How can biases in AI be addressed effectively?
- What are the potential risks of using AI in decision-making?
- In what ways can transparency improve AI trustworthiness?
- How might AI bias affect different social contexts?