New research explains why modern large language models struggle with a seemingly simple task: multiplying multi-digit numbers. The study examines how current training methods affect a model's ability to store and reuse intermediate results, a capability that long calculations require because of their long-range dependencies: partial products and running sums must be carried through many steps.
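To see why this is hard, consider what grade-school long multiplication actually demands. The short Python sketch below (an illustration, not code from the study) makes the bookkeeping explicit: every digit-pair product must be stored in the right column, and each answer digit depends on a running column sum plus carries from earlier columns.

```python
# Illustrative sketch (not from the study): digit-by-digit long
# multiplication, showing the intermediate values a model would have
# to carry across many steps to emit each answer digit.

def long_multiply(a: int, b: int) -> int:
    a_digits = [int(d) for d in str(a)][::-1]  # least-significant first
    b_digits = [int(d) for d in str(b)][::-1]
    # running[k] accumulates every digit-pair product landing in column k
    running = [0] * (len(a_digits) + len(b_digits))
    for i, da in enumerate(a_digits):
        for j, db in enumerate(b_digits):
            running[i + j] += da * db  # partial product, stored for later
    # resolve carries: each output digit depends on earlier running sums
    carry, digits = 0, []
    for column_sum in running:
        carry, digit = divmod(column_sum + carry, 10)
        digits.append(digit)
    while carry:
        carry, digit = divmod(carry, 10)
        digits.append(digit)
    return int("".join(map(str, digits[::-1])))

assert long_multiply(1234, 5678) == 1234 * 5678
```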
Researchers led by Xiaoyan Bai and Chenhao Tan at the University of Chicago, with collaborators from MIT, Harvard, the University of Waterloo and Google DeepMind, compared standard fine-tuning with an alternative training method called Implicit Chain of Thought (ICoT). Under standard fine-tuning, models with two to 12 layers achieved less than 1% accuracy on four-digit multiplication because they fell into a local optimum: they learned superficial patterns in the data but did not develop a mechanism to store intermediate values for later steps. By contrast, the ICoT-trained model reached 100% accuracy.
Probes of the models' internal states showed that the ICoT model encodes intermediate values: the researchers could decode running sums from its hidden states. ICoT also organizes attention along distinct temporal pathways. Early layers compute and store digit-pair products at specific positions, while later layers retrieve those values to form each digit of the final answer. During training, the team also observed digit representations built on Fourier-like bases and a geometric operation resembling a Minkowski sum.
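The article does not give the exact basis the model learned, but "Fourier-like" digit representations are often described as placing each digit on unit circles at a few frequencies. The hedged sketch below shows that construction and a nearest-codeword readout, loosely analogous to decoding digits from hidden states with a probe; the frequency choices here are an assumption for illustration.

```python
import numpy as np

# Hedged sketch: one common form of "Fourier-like" digit encoding;
# the study's actual basis and frequencies may differ.
# Each digit d in 0..9 is placed on unit circles at frequencies k.

def fourier_digit_features(d: int, freqs=(1, 2, 5)) -> np.ndarray:
    angles = [2 * np.pi * k * d / 10 for k in freqs]
    return np.array([f(a) for a in angles for f in (np.cos, np.sin)])

codebook = np.stack([fourier_digit_features(d) for d in range(10)])

def decode_digit(vec: np.ndarray) -> int:
    # nearest-codeword readout, as a linear probe might recover a digit
    return int(np.argmax(codebook @ vec))

assert all(decode_digit(fourier_digit_features(d)) == d for d in range(10))
```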
The authors then added a training objective that explicitly teaches a model to track running sums at each step. Applied to a two-layer model, this objective raised accuracy to 99% without explicit chain-of-thought supervision; the model developed attention patterns and strategies for tracking multiple digit pairs similar to those learned under ICoT. The study highlights that scaling alone does not overcome such limits, and that architectural guidance and targeted objectives can enable multi-step reasoning. "As AI is increasingly integrated into critical decision-making, it’s essential to understand its unique ways of learning and thinking," says Tan.
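One plausible way to realize such a targeted objective is an auxiliary loss that asks a linear head to read the current running-sum digit out of each hidden state, added to the usual language-modeling loss. The PyTorch sketch below is a hypothetical illustration: the module name, tensor shapes, and loss weighting are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of an auxiliary "running sum" objective, in the
# spirit described above; RunningSumProbeLoss and its inputs
# (hidden_states, sum_targets) are assumed names, not the paper's code.

class RunningSumProbeLoss(nn.Module):
    def __init__(self, hidden_dim: int, num_classes: int = 10):
        super().__init__()
        # linear head reads the current running-sum digit from each state
        self.head = nn.Linear(hidden_dim, num_classes)
        self.ce = nn.CrossEntropyLoss()

    def forward(self, hidden_states: torch.Tensor,
                sum_targets: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_dim)
        # sum_targets:   (batch, seq_len) running-sum digit at each step
        logits = self.head(hidden_states)
        return self.ce(logits.flatten(0, 1), sum_targets.flatten())

# Training would combine it with the usual LM loss, e.g.:
#   loss = lm_loss + aux_weight * running_sum_loss(h, targets)
```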
- Key mechanisms: encoded intermediate values
- Distinct attention pathways across time
- Fourier-like digit representation observed
- Targeted objectives can greatly improve performance
Difficult words
- intermediate — values or steps between start and end
- fine-tuning — adjusting a trained model with new data
- local optimum — a solution that is not globally best
- encode — store information in an internal form
- attention — mechanism to weigh or focus information
- temporal — related to time or sequence order
- objective — a specific goal used during training
- scale — increase model size or computing resources
- mechanism — a structure or process that produces behavior
Discussion questions
- How might targeted training objectives like the running-sum objective change the reliability of AI in real-world decisions?
- Can the attention and intermediate-value strategies described for multiplication be useful for other multi-step tasks? Why or why not?
- The article says scaling alone does not fix some limits. What trade-offs should researchers consider between scaling models and adding architectural guidance?