Hacker News
by crazygringo 1134 days ago
Wow, that's the most useful ChatGPT summary I've ever come across.

I've never found it particularly useful for most articles, which are easy enough to read/skim (the first and last 2 paragraphs will usually tell you what you need), but long, complicated legal documents are a whole other matter. This is great.

Works really well for legislation, too. GPT4 can pick up on deltas between bills using markup like <strike> if kept in the document, summarizing changes to the bill as it moves through the legislative process.
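To make the markup point concrete, here is a minimal sketch of what the retained strike tags carry. It only uses Python's standard html.parser; the names StrikeSplitter and split_struck are made up for illustration, and which tags a given legislature's site actually uses (strike, s, or del) is an assumption.

```python
from html.parser import HTMLParser

# Assumption: deletions are marked with one of these tags.
STRIKE_TAGS = {"strike", "s", "del"}


class StrikeSplitter(HTMLParser):
    """Separate struck-through text from the text that remains."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while inside a strike-style tag
        self.kept = []    # text outside any strike tag
        self.struck = []  # one entry per struck-through run

    def handle_starttag(self, tag, attrs):
        if tag in STRIKE_TAGS:
            if self.depth == 0:
                self.struck.append("")  # start a new struck run
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in STRIKE_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.struck[-1] += data
        else:
            self.kept.append(data)


def split_struck(markup):
    """Return (remaining_text, list_of_struck_runs), whitespace-normalized."""
    parser = StrikeSplitter()
    parser.feed(markup)
    parser.close()
    kept = " ".join("".join(parser.kept).split())
    struck = [" ".join(run.split()) for run in parser.struck]
    return kept, struck


kept, struck = split_struck("The fee is <strike>$50</strike> $75 per permit.")
print(kept)    # The fee is $75 per permit.
print(struck)  # ['$50']
```

If the tags are stripped before the text reaches the model, the struck runs either vanish or blend into the surviving text, and the delta is gone.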

The only challenge is chunking the larger bills and synthesizing the larger summary without losing out on possible nuances. Something like California's SB423, for example, is over twice the 8K token limit and that's not even a large bill.
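A rough sketch of the chunk-then-synthesize approach, under stated assumptions: token counts are approximated by whitespace-separated words (a real tokenizer will give different numbers), and summarize is a caller-supplied function standing in for whatever model call you use. All names here are hypothetical.

```python
def rough_token_count(text):
    # Crude stand-in for a tokenizer: counts whitespace-separated words.
    return len(text.split())


def chunk_paragraphs(text, max_tokens, count=rough_token_count):
    """Greedily pack paragraphs into chunks of at most max_tokens.

    A single paragraph longer than max_tokens becomes its own oversized
    chunk; splitting inside a paragraph is left out of this sketch.
    """
    chunks, current, used = [], [], 0
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        n = count(para)
        if current and used + n > max_tokens:
            chunks.append("\n\n".join(current))
            current, used = [], 0
        current.append(para)
        used += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def summarize_long(text, summarize, max_tokens):
    """Summarize each chunk, then summarize the joined partial summaries."""
    chunks = chunk_paragraphs(text, max_tokens)
    if len(chunks) <= 1:
        return summarize(text)
    partials = [summarize(chunk) for chunk in chunks]
    # One reduce pass only; the joined partials are assumed to fit the limit.
    return summarize("\n\n".join(partials))
```

The nuance problem lives in that last step: anything a chunk-level summary drops never reaches the final summary, and a provision in one chunk that modifies a provision in another is easy to miss because no single call sees both.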

Unfortunately, things like the US Code or Code of Federal Regulations are in the range of 100s of millions of tokens.

Assuming it’s accurate.
ChatGPT picks up 80% of the meaning and rewrites it in beautiful prose. Or maybe another language, in the style of Shakespeare.

On the other hand, if you're in a field where there's an adversarial use of text and the uncomprehended 20% might be used to nullify, contradict or make loopholes in the main body, then relying on ChatGPT is similar to using Tesla Full Self-Driving in a construction zone, near firetrucks, during a snowstorm.

Has ChatGPT been caught hallucinating on summarization tasks?

My impression was that hallucination happened when it simply didn't have facts in the first place, had conflicting facts, etc.

I thought summarization was generally fairly reliable, but I'd be happy to know if this is not the case.

Every summarization is a choice of salience: what to include, what to leave out, and how to express something in a different way.

The failure foolishly and misleadingly called “hallucination” is only one manifestation of an attribution error. If your summarizer leaves out something very important because it doesn’t understand it, the result will be quite misleading.

For your average web text, which these days is 90% filler and not important anyway, this is no big deal. This particular lawsuit appears the same. But for anything important, I wouldn’t trust it.

In my experience it’s generally accurate when summarizing content provided in the prompt context. Where it can run into trouble is “recalling” (if you can call it that) content that it was trained on.
It's accurate. (I read the whole filing.)