LingVo.club

AI models encode real-world plausibility (CEFR B2)

26 Apr 2026

Adapted from Brown University, Futurity CC BY 4.0

Photo by Zach M, Unsplash

Level B2 – Upper-intermediate
5 min
242 words

Researchers from Brown University presented a study at the International Conference on Learning Representations in Rio de Janeiro, Brazil, that probed whether language models encode knowledge about the real world. Michael Lepori, the PhD candidate who led the work, reports "some evidence that language models have encoded something like the causal constraints of the real world," and that the models' internal representations predict human plausibility judgments.

The team designed an experiment with sentences of varying plausibility — commonplace items like "Someone cooled a drink with ice," improbable examples such as "Someone cooled a drink with snow," impossible cases like "Someone cooled a drink with fire," and nonsensical lines like "Someone cooled a drink with yesterday." Using mechanistic interpretability, described by the authors as "neuroscience for AI systems," the researchers examined the models' internal mathematical states to see what the systems encode.

The study tested several open-source models, including OpenAI's GPT-2, Meta's Llama 3.2 and Google's Gemma 2. The researchers found that sufficiently large models developed distinct internal vectors that corresponded to plausibility categories and could even separate similar categories, such as improbable versus impossible, with roughly 85% accuracy. These vectors also reflected human uncertainty on ambiguous statements and began to appear in models with more than 2 billion parameters, a size that is small compared with today's trillion-plus-parameter systems.

  • Mechanistic interpretability can reveal what models encode.
  • Vectors map to human plausibility judgments.
  • Findings may aid development of smarter, more trustworthy models.

Difficult words

  • encode – store information in a different form (encoded)
  • plausibility – how likely or believable something seems
  • mechanistic interpretability – the study of the internal mechanisms of machine learning models
  • representation – an internal model state that stores information (representations)
  • vector – a mathematical list of numbers used by models (vectors)
  • parameter – a numeric value that controls model behavior (parameters)
  • causal constraint – a rule linking cause and effect in the real world (causal constraints)

Tip: hover, focus or tap highlighted words in the article to see quick definitions while you read or listen.

Discussion questions

  • How could knowledge of internal vectors help developers make language models more trustworthy?
  • What risks or limitations might remain even if models encode causal constraints?
  • Given that these vectors appeared in models over two billion parameters, how should teams balance model size and explainability when choosing a model?
