Detecting hallucinations in large language models using semantic entropy

Salamander · 8 months ago

@[email protected] · 8 months ago

We automatically decompose a long generated answer into factoids. For each factoid, an LLM generates questions to which that factoid might have been the answer. The original LLM then samples M possible answers to these questions. Finally, we compute the semantic entropy over the answers to each specific question, including the original factoid. Confabulations are indicated by high average semantic entropy for questions associated with that factoid.

It sounds like they verify one LLM’s answer by getting a second one to ask the same question over and over in slightly different ways and checking whether its answers stay the same, iterating over each piece of the answer, so that essentially the original prompt is exploded and then Monte Carlo'd.
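For anyone who wants to see the "do the answers stay the same" part concretely, here is a rough Python sketch of the scoring step described in the quote. It is not the paper's code: `same_meaning` is a made-up placeholder that only compares normalized strings (a real setup would need some semantic equivalence check in its place), and the names `cluster_answers`, `semantic_entropy` and `factoid_score` are my own. The idea is just: group the sampled answers by meaning, take the entropy over the group sizes, then average over the questions for one factoid.

```python
import math
from typing import Callable, List


def same_meaning(a: str, b: str) -> bool:
    # ASSUMPTION: placeholder equivalence test, not the paper's method.
    # It only ignores case, spacing and punctuation.
    def norm(s: str) -> str:
        return "".join(ch for ch in s.lower() if ch.isalnum())
    return norm(a) == norm(b)


def cluster_answers(
    answers: List[str],
    equivalent: Callable[[str, str], bool] = same_meaning,
) -> List[List[str]]:
    # Greedy grouping: put each answer in the first cluster whose
    # representative (first member) it matches, else start a new cluster.
    clusters: List[List[str]] = []
    for ans in answers:
        for cluster in clusters:
            if equivalent(cluster[0], ans):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])
    return clusters


def semantic_entropy(
    answers: List[str],
    equivalent: Callable[[str, str], bool] = same_meaning,
) -> float:
    # Entropy (in nats) over the share of samples in each meaning cluster.
    clusters = cluster_answers(answers, equivalent)
    total = len(answers)
    return -sum(
        (len(c) / total) * math.log(len(c) / total) for c in clusters
    )


def factoid_score(
    answers_per_question: List[List[str]],
    equivalent: Callable[[str, str], bool] = same_meaning,
) -> float:
    # Average entropy over all questions generated for one factoid.
    # Expects at least one question.
    entropies = [semantic_entropy(a, equivalent) for a in answers_per_question]
    return sum(entropies) / len(entropies)
```

If every sample for a question lands in one cluster the entropy is 0 (the model keeps saying the same thing); if the samples scatter across many clusters the entropy goes up, which is the signal the quote associates with confabulation.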

TragicNotCute · 8 months ago

What the hell? 'Scuse me. Who’s watchin these AIs?

Uh - the fat one’s watchin the little one?

Detecting hallucinations in large language models using semantic entropy - Nature