
Asking chatbots for short answers can increase hallucinations, study finds


Turns out, telling an AI chatbot to be concise could make it hallucinate more than it otherwise would have.

That’s according to a new study from Giskard, a Paris-based AI testing company developing a holistic benchmark for AI models. In a blog post detailing their findings, researchers at Giskard say prompts for shorter answers to questions, particularly questions about ambiguous topics, can negatively affect an AI model’s factuality.

“Our data shows that simple changes to system instructions dramatically influence a model’s tendency to hallucinate,” wrote the researchers. “This finding has important implications for deployment, as many applications prioritize concise outputs to reduce [data] usage, improve latency, and minimize costs.”

Hallucinations are an intractable problem in AI. Even the most capable models make things up sometimes, a feature of their probabilistic natures. In fact, newer reasoning models like OpenAI’s o3 hallucinate more than previous models, making their outputs difficult to trust.

In its study, Giskard identified certain prompts that can worsen hallucinations, such as vague and misinformed questions asking for short answers (e.g. “Briefly tell me why Japan won WWII”). Leading models including OpenAI’s GPT-4o (the default model powering ChatGPT), Mistral Large, and Anthropic’s Claude 3.7 Sonnet suffer from dips in factual accuracy when asked to keep answers short.


Why? Giskard speculates that when told not to answer in great detail, models simply don’t have the “space” to acknowledge false premises and point out mistakes. Strong rebuttals require longer explanations, in other words.

“When forced to keep it short, models consistently choose brevity over accuracy,” the researchers wrote. “Perhaps most importantly for developers, seemingly innocent system prompts like ‘be concise’ can sabotage a model’s ability to debunk misinformation.”

Giskard’s study contains other curious revelations: models are less likely to debunk controversial claims when users present them confidently, and the models that users say they prefer aren’t always the most truthful. Indeed, OpenAI has struggled recently to build models that validate users without coming across as overly sycophantic.

“Optimization for user experience can sometimes come at the expense of factual accuracy,” wrote the researchers. “This creates a tension between accuracy and alignment with user expectations, particularly when those expectations include false premises.”




Published by Abigail Avery
