Huh? That is the literal opposite of what I said. Like, diametrically opposite.
The system summarizes and hashes docs. The model can only answer from those summaries in that mode. There’s no semantic retrieval step.
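For anyone trying to picture it, here is a rough sketch of that mode as I understand it from the description above. The names (`SummaryRecord`, `make_record`, `build_context`, `first_sentence`) are made up for illustration and are not the actual system's code; the summarizer is a trivial stand-in.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SummaryRecord:
    source_sha256: str  # hash of the exact source text that was summarized
    summary: str


def make_record(source_text, summarize):
    """Summarize a document and pin the summary to a hash of its source."""
    digest = hashlib.sha256(source_text.encode("utf-8")).hexdigest()
    return SummaryRecord(source_sha256=digest, summary=summarize(source_text))


def build_context(records):
    """Concatenate every summary verbatim: no ranking, no embeddings, no retrieval."""
    return "\n".join(f"[{r.source_sha256[:12]}] {r.summary}" for r in records)


def first_sentence(text):
    """Stand-in summarizer (assumption): keep only the first sentence."""
    return text.split(".")[0].strip() + "."
```

The point of the hash is only that a summary can be matched to one exact version of its source; whether the summary itself is right is a separate question.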
No, that’s exactly what you wrote.
Now, with this change
SUMM -> human reviews
That would fix it, but it will only work for small KBs, since otherwise the summary would have to be exhaustive.
Case in point: assume a Person model with 3-7 facts per Person. Assume a small set of 3000 Persons. How would the SUMM of that work? Do you expect a human to verify that SUMM? How are you going to converse with your system to get the data from that KB Person set? Because to me that sounds like case C, only works for small KBs.
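To put rough numbers on that: a back-of-the-envelope sketch. The 3000 Persons and 3-7 facts come from the example above; the 10 seconds per fact is purely my assumption, and `review_burden` is a made-up helper.

```python
def review_burden(n_records, facts_min, facts_max, seconds_per_fact):
    """Return (min_facts, max_facts, min_hours, max_hours) for a full human review."""
    lo = n_records * facts_min
    hi = n_records * facts_max
    return lo, hi, lo * seconds_per_fact / 3600, hi * seconds_per_fact / 3600


# 3000 Persons, 3-7 facts each, assumed 10 s of human attention per fact
lo_facts, hi_facts, lo_hours, hi_hours = review_burden(3000, 3, 7, 10)
print(f"{lo_facts}-{hi_facts} facts, roughly {lo_hours:.0f}-{hi_hours:.0f} hours of review")
```

That is 9000 to 21000 individual facts for what I called a small set, before any source document changes and has to be re-reviewed.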
Again: the proposition is not “the model will never hallucinate.” It’s “it can’t silently propagate hallucinations without a human explicitly allowing it to, and when it does, you can trace it back to the source version.”
Fair. Except that you are still left with the original problem: you don’t know WHEN the information is incorrect if you missed it at SUMM time.
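A minimal sketch of what that proposition amounts to, assuming each summary carries a source id, a source version, and a reviewer sign-off. All field and function names here are hypothetical, not taken from the actual system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Summary:
    text: str
    source_id: str
    source_version: str                # e.g. a content hash of the source
    approved_by: Optional[str] = None  # None means no human has signed off


def usable(summaries):
    """Only human-approved summaries may reach the model."""
    return [s for s in summaries if s.approved_by is not None]


def trace(summary):
    """Where a (possibly wrong) statement came from."""
    return summary.source_id, summary.source_version
```

Note what this does and does not buy you: an unapproved summary never gets through, and an approved one can be traced, but nothing here detects that an approved summary is wrong.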
Woof, after reading your “contributions” here, are you this fucking insufferable IRL or do you keep it behind a keyboard?
Goddamn. I’m assuming you work in tech in some capacity? Shout-out to anyone unlucky enough to white-knuckle through a workday with you, avoiding an HR incident would be a legitimate challenge, holy fuck.
Hallucination isn’t nearly as big a problem as it used to be. Newer models aren’t perfect but they’re better.
The problem addressed by this isn’t hallucination, it’s the training to avoid failure states. Instead of guessing (different from hallucination), the system forces a Negative response.
That’s easy and any company, big or small, could do it; big companies just like the bullshit.
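In toy form, "force a Negative response instead of guessing" looks something like this. The exact-key lookup is a stand-in I picked for illustration (an assumption); a real system would need some other way of deciding whether the KB supports an answer.

```python
NEGATIVE = "Not in the knowledge base."


def answer(question_key, facts):
    """Return the stored fact, or a fixed negative response; never a guess."""
    if question_key not in facts:
        return NEGATIVE
    return facts[question_key]


facts = {"alice.age": "30"}
print(answer("alice.age", facts))   # stored fact
print(answer("alice.city", facts))  # fixed negative response
```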
Buuuuullshit. I asked different models about the ten highest summer transfer scorers and got wildly different answers. They then tried to explain why and got even more numbers wrong.
I want to believe you, but that would mean you solved hallucination.
Either:
A) you’re lying
B) you’re wrong
C) KB is very small
So… RAG with extra steps and RAG summarization? What about facts that aren’t covered by RAG retrieval?
Oh boy. So hallucination will occur here, and all further retrievals will be deterministically poisoned?
A benchmark tailored very much to LLMs’ strengths calls you a liar.
https://artificialanalysis.ai/articles/gemini-3-flash-everything-you-need-to-know (A month ago the hallucination rate was ~50-70%)