A team led by Xiaoyan Bai and Chenhao Tan at the University of Chicago, with collaborators from MIT, Harvard, the University of Waterloo and Google DeepMind, studied why state-of-the-art language models fail at long multiplication. They focused on long-range dependencies: the need to hold partial products and running sums to reach a correct final answer.
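To make partial products and running sums concrete, here is a minimal Python sketch of schoolbook long multiplication. It is only an illustration of the arithmetic, not the researchers' code; the function name `long_multiply` and the example numbers are made up for this sketch.

```python
def long_multiply(a: int, b: int):
    """Multiply a by b the schoolbook way, recording intermediate values."""
    partial_products = []  # one value per digit of b
    running_sums = []      # total after each partial product is added
    total = 0
    # Walk through the digits of b from least to most significant.
    for position, digit_char in enumerate(reversed(str(b))):
        # Partial product: a times one digit of b, shifted by its place value.
        partial = a * int(digit_char) * 10 ** position
        total += partial
        partial_products.append(partial)
        running_sums.append(total)
    return partial_products, running_sums, total


partials, sums, answer = long_multiply(1234, 5678)
print(partials)  # [9872, 86380, 740400, 6170000]
print(sums)      # [9872, 96252, 836652, 7006652]
print(answer)    # 7006652
```

Every running sum depends on all the earlier partial products, so a single wrong or forgotten intermediate value changes the final answer.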
Under standard fine-tuning, models with 2 to 12 layers achieved less than 1% accuracy on four-digit multiplication; the researchers concluded these models fell into a local optimum by learning surface patterns rather than storing intermediate values. In contrast, a model trained with Implicit Chain of Thought (ICoT) reached 100% accuracy. Probing the ICoT model showed that its hidden states encoded intermediate values: the running sums could be decoded from them.
The team also tested a simple training objective that teaches a model to track running sums at each step. Adding that objective to a two-layer model raised accuracy to 99% and produced attention patterns similar to ICoT. The study argues that architectural guidance and targeted objectives can enable multi-step reasoning.
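The article does not say exactly how the extra objective is computed. As a rough sketch only, one common way to add such an objective is to combine the usual answer loss with a penalty on how far the model's predicted running sums are from the true ones. Everything here is an assumption for illustration: the function name `combined_loss`, the use of mean squared error, and the `weight` parameter are hypothetical and not taken from the study.

```python
def combined_loss(answer_loss, predicted_sums, target_sums, weight=1.0):
    """Hypothetical total loss: answer loss plus a running-sum penalty.

    answer_loss    -- loss on the final answer (any non-negative number)
    predicted_sums -- the model's guess for the running sum at each step
    target_sums    -- the true running sum at each step
    weight         -- how strongly the extra objective counts (assumed)
    """
    if len(predicted_sums) != len(target_sums):
        raise ValueError("need one prediction per target running sum")
    # Mean squared error between predicted and true running sums.
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted_sums, target_sums)]
    running_sum_loss = sum(squared_errors) / len(squared_errors)
    return answer_loss + weight * running_sum_loss


# Example: the second running sum is off by 2, so the penalty is (0 + 4) / 2 = 2.
loss = combined_loss(0.5, [1.0, 2.0], [1.0, 4.0], weight=0.1)
```

With a loss of this kind, a model that ignores intermediate values is penalised even before its final answer is checked, which is the general idea behind teaching it to track running sums.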
Difficult words
- long-range dependency: need to keep information across many steps
- partial product: a number from one multiplication step
- running sum: a total that updates after each step
- fine-tuning: training a model on new task data
- local optimum: a solution that is not best overall
- Implicit Chain of Thought (ICoT): training method that encourages stepwise reasoning
Discussion questions
- Why is it helpful for a model to store intermediate values when doing long multiplication?
- Do you think the same training objective (tracking running sums) could help models in other multi-step tasks? Why or why not?
- Which is more important for multi-step reasoning: model architecture or specific training objectives? Explain with simple reasons.