HN Reader

OpenAI’s latest research paper demonstrates that falsehoods are inevitable

Saying “I don’t know” to 30% of queries, if it actually doesn’t know, is a feature I want. Otherwise there is zero trust. How do I know whether I’m in the 30% wrong or the 70% correct situation right now?

4 months ago by binarymax

> Users accustomed to receiving confident answers to virtually any question would likely abandon such systems rapidly.

Or maybe they would learn from feedback to use the system for some kinds of questions but not others? It depends on how easy it is to learn the pattern. This is a matter of user education.

Saying "I don't know" is sort of like an error message. Clear error messages make systems easier to use. If the system can give accurate advice about its own expertise, that's even better.

4 months ago by skybrian

A better headline might be "OpenAI research suggests reducing hallucinations is possible but may not be economical".

4 months ago by gary_0

This is written by someone who has no idea how transformers actually work

4 months ago by danjc

I felt this was such a cogent article on business imperatives vs fundamental transformer hallucinations, couldn’t help but HN-submit. In fact seems like a stealth plea for uncertainty-embracing benchmarks industry-wide.

4 months ago by ricksunny

The RSS feed title didn’t seem to align with the content. It said more that they hallucinate because they were trained to give answers, and an “I don’t know” is penalized as much as a wrong answer. And they say they don’t fix this because if people got an “I dunno” 30% of the time they would stop using it. I don’t see how telling users when it isn’t confident of the answer would cause ChatGPT to stop completely tomorrow. People like answers, but most assume the answers are correct, and it would be very helpful to know when the bot isn’t sure. It could say there isn’t enough credible information, or that there is a lot of conflicting information on the matter, and then say what those different potential answers are or how the user might confirm the answer, etc. Seems like many options. And you could always let the user choose in preferences whether they’d always prefer an answer or not.

4 months ago by TechRemarker
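
The grading incentive described in the comment above can be checked with a little arithmetic. This is a minimal sketch, not code from the paper or the article: the function names and the scoring scheme (+1 for a correct answer, 0 for "I don't know", a configurable penalty for a wrong answer) are assumptions chosen for illustration. With a penalty of 0, which is the "IDK is penalized as much as a wrong answer" case, guessing never scores worse than abstaining, however unsure the model is.

```python
def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score of answering when the answer is right with probability p_correct.

    Assumed scheme: a correct answer scores +1, a wrong answer scores
    -wrong_penalty, and abstaining ("I don't know") scores 0.
    """
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def should_answer(p_correct: float, wrong_penalty: float) -> bool:
    """Answer only if the expected score beats the abstention score of 0."""
    return expected_score(p_correct, wrong_penalty) > 0.0


def penalty_for_threshold(t: float) -> float:
    """Penalty under which answering pays off only when confidence exceeds t.

    Solving p - (1 - p) * w > 0 for p gives p > w / (1 + w);
    choosing w = t / (1 - t) makes that break-even point equal to t.
    """
    return t / (1.0 - t)


# Binary grading (no penalty): even a 10% shot is worth taking.
print(should_answer(0.10, wrong_penalty=0.0))
# Penalty tuned for a 75% threshold: a coin-flip guess is no longer worth it.
print(should_answer(0.50, wrong_penalty=penalty_for_threshold(0.75)))
```

Under this assumed scheme, the only thing that changes between the "always guess" and the "say IDK" behavior is the size of the penalty for being wrong.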

A straightforward solution to the author's problem is to offer both modes of answering, with errors or with "IDK" answers. Even charge more for the IDK version if it costs more, and the error-prone version can be "cheap and cheerful"...

4 months ago by toss1

The more “terms of art” a field uses, the more errors the AI makes and the harder time it has understanding. Hence, areas like law and medicine, rich in specialized jargon, frequently pose challenges for AI.

I use it for legal work and I find it has issues when a term is defined in a very specific way in one section and another way in another. There are obviously a lot of semantics involved when the definitions of words are rewritten in separate code sections and separate contracts.

4 months ago by daft_pink

"What is the real meaning of humility?

AI Overview

The real meaning of humility is having an accurate, realistic view of oneself, acknowledging both one's strengths and limitations without arrogance or boastfulness, and a modest, unassuming demeanor that focuses on others. It's not about having low self-esteem but about seeing oneself truthfully, putting accomplishments in perspective, and being open to personal growth and learning from others."

Sounds like a good thing to me. Even, winning.

4 months ago by lif

Isn't it even simpler? There are no (or almost no) questions in the training data to which the correct answer is "I don't know".

Once you train a model within a specific domain and add out-of-domain questions, or unresolvable in-domain questions, to the training data, things will improve.

The question is whether this is desirable if most users have grown to love sycophantic, confident confabulators.

4 months ago by scotty79

This is a branding problem.

Calling the model ‘calibrated’ or ‘honest’ or ‘humble’ suffers from what is called out: people don’t want a humble answer of ‘I don’t know’, they want a solution to their problem, confidently delivered so they can trust it.

Call the calibrated model ‘business mode’ and the guessing one ‘consumer mode’, problem solved… in as much capacity as possible without regulation.

4 months ago by baq

DeepSeek-V3 confidently provided three different incorrect dates across separate attempts

Hallucination is an issue. But so is the lack of consistency.

The same question asked on different occasions can produce different results. Not just variations in wording but in meaning as well.

There is nothing determinate or consistent about the output. And this makes the output unreliable.

4 months ago by jqpabc123
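
The inconsistency described in the comment above can be put into a number by asking the same question several times and measuring how often the attempts agree. This is a rough sketch under stated assumptions: `agreement` is a hypothetical helper, the answers are placeholder strings rather than real model output, and matching is a naive case-insensitive string comparison that would not recognize two differently worded answers with the same meaning.

```python
from collections import Counter


def agreement(answers: list[str]) -> tuple[str, float]:
    """Return the most common answer and the fraction of attempts that gave it.

    Answers are compared after trimming whitespace and lowercasing, so this
    only catches exact repeats, not paraphrases.
    """
    if not answers:
        raise ValueError("need at least one answer")
    normalized = [a.strip().lower() for a in answers]
    top, count = Counter(normalized).most_common(1)[0]
    return top, count / len(normalized)


# Placeholder answers from three separate attempts at the same question.
attempts = ["date A", "date B", "date C"]
top, rate = agreement(attempts)
print(f"most common answer: {top!r}, agreement: {rate:.2f}")
```

A low agreement rate does not say which answer is wrong, only that the output should not be treated as settled.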

What about generating an answer and scoring its confidence in parallel, then running a second LLM to rephrase the answer accordingly: "I vaguely remember Bob's birthday is 1st March but I may be wrong, I should search the web"?

4 months ago by miellaby
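
The pipeline proposed in the comment above (generate, score, rephrase) might look roughly like this. It is only a sketch: `generate` and `score` are hypothetical callables standing in for model calls, a fixed template stands in for the second LLM that would do the rephrasing, and the 0.9 and 0.5 cut-offs are arbitrary assumptions. For simplicity the steps run one after another rather than in parallel.

```python
from typing import Callable


def hedge(answer: str, confidence: float) -> str:
    """Rephrase an answer according to a confidence score in [0, 1].

    A template stands in for the second model; the cut-offs are arbitrary.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= 0.9:
        return answer
    if confidence >= 0.5:
        return f"I think {answer}, but I may be wrong."
    return f"I'm not sure. My best guess is that {answer}, so this should be checked."


def answer_with_confidence(
    question: str,
    generate: Callable[[str], str],      # hypothetical: question -> draft answer
    score: Callable[[str, str], float],  # hypothetical: (question, answer) -> confidence
) -> str:
    """Draft an answer, score it, then word it to match the score."""
    draft = generate(question)
    return hedge(draft, score(question, draft))


# Stub callables in place of real model calls.
reply = answer_with_confidence(
    "When is Bob's birthday?",
    generate=lambda q: "Bob's birthday is 1 March",
    score=lambda q, a: 0.6,
)
print(reply)  # I think Bob's birthday is 1 March, but I may be wrong.
```

The hard part is the `score` step: the sketch assumes a usable confidence number exists, which is exactly what the thread is debating.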

This is why a neurosymbolic system is necessary, which Aloe (https://aloe.inc) recently demonstrated exceeds the performance of frontier models, using a model-agnostic approach.

4 months ago by justcallmejm

The author doesn't bother to consider that giving a false response already leads to more model calls until a better one is provided.

4 months ago by fumeux_fume

Easily solved: pairs of models, one which would rather say IDK and one which would rather guess. Most AI agents would want the IDK version.

4 months ago by jasfi

From the abstract of the paper [^0]:

> Like students facing hard exam questions, large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty

This is a de facto false equivalence for two reasons.

First, test takers who are faced with hard questions have the capability of _simply not guessing at all._ UNC did a study on this [^1] by administering a light version of the AMA medical exam to 14 staff members who were NOT trained in the life sciences. While most of them consistently guessed answers, roughly 6% of them did not. Unfortunately, the study did not disambiguate correct guesses versus questions that were left blank. OpenAI's paper proves that LLMs, at the time of writing, simply do not have the self-awareness of knowing whether they _really_ don't know something, by design.

Second, LLMs are not test takers in the pragmatic sense. They are query answerers. Bar argument settlers. Virtual assistants. Best friends on demand. Personal doctors on standby.

That's how they are marketed and designed, at least.

OpenAI wants people to use ChatGPT like a private search engine. The sources it provides when it decides to use RAG are there more to instill confidence in the answer than to encourage users to check its work.

A "might be inaccurate" disclaimer on the bottom is about as effective as the Surgeon General's warning on alcohol and cigs.

The stakes are so much higher with LLMs. Totally different from an exam environment.

A final remark: I remember professors hammering "engineering error" margins into us when I was a freshman in 2005. 5% was what was acceptable. That we as a society are now okay with using a technology that has a >20% chance of giving users partially or completely wrong answers to automate as many human jobs as possible blows my mind. Maybe I just don't get it.

[^0] https://arxiv.org/pdf/2509.04664

[^1] https://www.rasch.org/rmt/rmt271d.htm

4 months ago by nunez

The article says "Consider the implications if ChatGPT started saying “I don’t know” to even 30% of queries – a conservative estimate based on the paper’s analysis of factual uncertainty in training data. Users accustomed to receiving confident answers to virtually any question would likely abandon such systems rapidly." Maybe. But not me. I would trust it more, and rely on it even more. I can work with someone who says "I don't know" but is super smart. And I'll bet more people will do the same. Over time, the system may enjoy the rewards of communal trust over and above what it currently enjoys. However, over the long term, this may lead to a more dystopian version of what might happen currently. We may all give blind trust because we all trust it. Give it a decade, or half of that, and then the system goes wrong... Yikes. We have to grapple with the ongoing advice that "ChatGPT can make mistakes. Check important info." And we do. Because we have to, or at least some of us do. And that is a good thing.

4 months ago by afspear

We have always known LLMs are prediction machines. How is this report novel?

4 months ago by pdntspa