- LLMs can give advice or instructions to online users.
- This advice can sometimes be dangerous.
- Researchers at a university recently studied safety in models.
- They want models to avoid harming people online directly.
- Safety training can sometimes make model answers less accurate.
- Some safety checks are easy for users to bypass.
- The team recently found important parts inside the models.
- They froze some parts so safety stayed the same.
Difficult words
- advice — words that tell someone what to do
- dangerous — likely to cause harm or hurt people
- researcher — a person who studies and tests things
- safety — the state of being free from danger
- accurate — correct and true, not wrong
- bypass — go around a rule or system
Discussion questions
- Do you use online advice?
- Have you seen wrong advice online?
- Do you worry about safety online?