Hacker News
by scosman 248 days ago
Saying it isn't useful is a bit of an overstatement. It can search, churn through 500k words in a few minutes, and come back with summaries, answers, and sources for each point.

Should you blindly trust the summary? No. Should you verify key claims by clicking through to the source? Yes. Is it still incredibly useful as a search tool and productivity booster? Absolutely.


I gave it a PDF recently and asked it to help me generate some tables based on the information therein. I thought I'd be saving myself time. I spent easily twice as long as I would have if I had done it myself. It kept making trivial mistakes, misunderstanding what was in the PDF, hallucinating, etc.
Last summer I used one of the models to help translate a few German Wikipedia pages to English, hoping it would make things easier by keeping all the formatting etc. that I'd lose if I copy-pasted just the text via Google Translate.

I did check the translations were correct as part of this — while my German isn't great, it was sufficient for this — and it was fine up until reaching a long table about the timeline of events relevant to the subject, at which point it couldn't help but make stuff up.

Still useful, but when you find the limits of their competence, there's no point attempting to cajole them to go further. They'll save you whatever % of the task in effort, now you have to do all the rest; it's a waste of effort to think either carrot or stick will get them to succeed if they can't do it in the first few tries.

It is excellent when just finding something is enough. Most often in my practice, I am dealing with questions that have no written-down answers, meaning the probability of finding a book or article that provides one is negligible. Instead, I am looking for indirect answers or proofs before I make a final engineering decision. Another problem is that the language itself changes over time. For instance, at the beginning of the 20th century, the integers were called integral numbers. IMHO, LLMs handle such cases poorly when considered as a substitute for search engines. For full-text vector search, I am using https://www.recoll.org/, a real time-saver for me, especially for desktop search.
> GPT-5 is proving useful as a literature review assistant

> No, it does not.

> It is excellent when just finding something is enough.

I meant that it obviously fits your needs but not mine
> No, it does not. It only produces a highly convincing counterfeit.

How could you say that with high confidence when you admitted it might be useful for others?

Because that is precisely what the word counterfeit means: an imitation that deceives you. A counterfeit's functionality can be anywhere from 0% to 100%, depending on your luck. If you accidentally buy a fake iPhone, it may still make calls. Chat output is something that looks like a literature review: a collection of summaries of relevant papers. But a review is not just text compression. It is also about rejecting low-quality research, considering the historical context, identifying contradictions, etc. No machine can do that analysis for you yet. I am quite confident of that.
If I know what's in the paper, I don't need the summary; if I don't know what's in the paper, I cannot possibly judge the accuracy of its summary.
If we're talking about literature search here, the workflow is

1. Get list of sources and their summaries from an LLM.

2. Read through, find a paper whose title and summary seem interesting to you.

3. Follow the LLM's link, usually to an arXiv posting.

4. Read the title and abstract on arXiv. You can now judge the accuracy of the LLM's summary.

It's really easy to tell if the LLM is accurate when it is linking to something which has its own title and summary, which is almost always the case in literature search.
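
For anyone who wants to pre-screen a long list before clicking through, here is a rough Python sketch of the title check in step 4. It assumes you have already copied the title from each linked page by hand; the field names (`llm_title`, `source_title`), the 0.9 threshold, and the example titles are all made up for illustration. It only flags title mismatches, so you still have to read the abstract yourself to judge the summary.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences don't count."""
    return " ".join(text.lower().split())


def title_matches(llm_title: str, source_title: str, threshold: float = 0.9) -> bool:
    """True if the LLM-reported title is close to the title on the linked page."""
    ratio = SequenceMatcher(None, normalize(llm_title), normalize(source_title)).ratio()
    return ratio >= threshold


def flag_for_review(entries, threshold: float = 0.9):
    """Return the entries whose LLM-reported title does not match the linked page."""
    return [
        e for e in entries
        if not title_matches(e["llm_title"], e["source_title"], threshold)
    ]


# Hypothetical entries: the first link checks out, the second points elsewhere.
entries = [
    {"llm_title": "A Survey of Widget Sorting",
     "source_title": "A survey of  widget sorting"},
    {"llm_title": "A Survey of Widget Sorting",
     "source_title": "Unrelated Study on Gadget Folding in Cold Climates"},
]
suspicious = flag_for_review(entries)
```

Anything that lands in `suspicious` is a link where the LLM's title and the page's title disagree, which is usually the first sign of a made-up or mislinked source.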