Researchers from Brown University presented a study at the International Conference on Learning Representations in Rio de Janeiro, Brazil, probing whether language models encode knowledge about the real world. Michael Lepori, the PhD candidate who led the work, reports "some evidence that language models have encoded something like the causal constraints of the real world" and says the models' internal representations predict human plausibility judgments.
The team designed an experiment with sentences of varying plausibility — commonplace items like "Someone cooled a drink with ice," improbable examples such as "Someone cooled a drink with snow," impossible cases like "Someone cooled a drink with fire," and nonsensical ones like "Someone cooled a drink with yesterday." Using mechanistic interpretability, which the authors describe as "neuroscience for AI systems," the researchers examined the models' internal mathematical states to see what the systems encode.
The study tested several open-source models, including OpenAI's GPT-2, Meta's Llama 3.2, and Google's Gemma 2. The researchers found that sufficiently large models developed distinct internal vectors corresponding to plausibility categories and could even separate similar categories, such as improbable versus impossible, with roughly 85% accuracy. These vectors also reflected human uncertainty on ambiguous statements, and they began to appear in models with more than 2 billion parameters, a size that is small compared with today's trillion-plus-parameter systems.
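The core technique here is a linear "probe" trained on a model's internal activations. Below is a minimal, hypothetical sketch of that idea, not the study's actual code: it assumes the Hugging Face transformers library, PyTorch, and scikit-learn, and it reuses the article's four example sentences as a toy dataset (a real probe would be trained and evaluated on many sentences per category).

```python
# Minimal sketch of a linear probe over language-model hidden states.
# This is an illustration of the general technique, not the study's code.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
model.eval()

# Toy dataset: the article's four plausibility categories, one sentence each.
# A real probe would use many labeled sentences per category plus held-out data.
sentences = {
    "probable":   "Someone cooled a drink with ice.",
    "improbable": "Someone cooled a drink with snow.",
    "impossible": "Someone cooled a drink with fire.",
    "nonsense":   "Someone cooled a drink with yesterday.",
}

def hidden_state(text: str) -> np.ndarray:
    """Return the final-layer representation of the sentence's last token."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state[0, -1].numpy()

X = np.stack([hidden_state(s) for s in sentences.values()])
y = list(sentences.keys())

# Linear probe: a simple classifier trained directly on hidden states.
probe = LogisticRegression(max_iter=1000).fit(X, y)

# Hypothetical test sentence, just to show how the probe is applied.
print(probe.predict([hidden_state("Someone warmed soup with a stove.")]))
```

If a simple classifier like this can tell the categories apart from hidden states alone, that is evidence the model encodes plausibility information internally; the roughly 85% accuracy figure above comes from this kind of classification.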
Key points
- Mechanistic interpretability can reveal what models encode.
- Vectors map to human plausibility judgments.
- Findings may aid development of smarter, more trustworthy models.
Difficult words
- encode — store information in a different form
- plausibility — how likely or believable something seems
- mechanistic interpretability — study of internal mechanisms in machine learning models
- representation — internal model state that stores information
- vector — a mathematical list of numbers used by models
- parameter — a numeric value that controls model behavior
- causal constraint — a rule linking cause and effect in the real world
Discussion questions
- How could knowledge of internal vectors help developers make language models more trustworthy?
- What risks or limitations might remain even if models encode causal constraints?
- Given that these vectors appeared in models with more than two billion parameters, how should teams balance model size and explainability when choosing a model?
Related articles
People with AMD judge car arrival times like others
A virtual reality study compared adults with age-related macular degeneration (AMD) and adults with normal vision. Both groups judged vehicle arrival times similarly when using vision and sound together, and no multimodal benefit appeared.
AI coach helps medical students learn suturing
Researchers at Johns Hopkins developed an explainable AI tool that gives immediate text feedback to medical students practicing suturing. A small randomized study found faster learning for students with prior experience; beginners showed less benefit.
Algae-based synthetic gel supports mammary tissue growth
In 2020 a PhD student and her adviser at UC Santa Barbara developed an algae-based synthetic membrane to support mammary epithelial cells. Their tunable gel, reported in Science Advances, can direct cell growth by changing mechanical and biochemical cues.
Reducing unsafe responses in large language models
Researchers studied how large language models (LLMs) handle safety and tested training methods to reduce unsafe outputs while keeping performance. They identified key challenges and a technique that preserves safety during fine-tuning.