Hacker News
by 13years 787 days ago
What are the solutions?

As pointed out in the article, some LLMs appear to know the information when asked to list episodes, then deny it later. This kind of inconsistency is general, not specific to this one question.

It is not about looking up trivia. The problem is that you never know the competence level of any answer it gives you.

1 comment

I think what the parent poster meant is that the most useful way to use today's LLMs is to accept their limitations and weaknesses and work around them. Better models will come, but for now this is what you have to do.

For example, use LLMs to transform text rather than generate it from scratch (where they are prone to hallucinate). A general-purpose chatbot is not a great use case!

For this particular Gilligan's Island task it'd be better to first retrieve the list of episode titles (or descriptions, if those are needed), then ask the LLM which of them is about "mind reading". There are various ways to do this sort of thing, depending on how specific or constrained the task you are trying to accomplish is. In the most general case you could ask a powerful model like Claude Opus to create a plan composed of simpler steps. In other cases your application already knows what it wants to do and how to do it, and will call an LLM as a tool only for the specific steps the model is capable of.
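
To make the "retrieve first, then ask" idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: `ask_llm` is a hypothetical stand-in for whatever model call you actually use, `build_prompt` and `pick_episode` are made-up helper names, and the title list is a short hand-typed sample standing in for a list you would fetch from a trusted source. The point is that the model only selects from text it is given, and the application checks the reply before using it.

```python
from typing import Callable, Optional, Sequence


def build_prompt(titles: Sequence[str], topic: str) -> str:
    """Put the retrieved titles in the prompt so the model selects, not recalls."""
    numbered = "\n".join(f"{i}. {t}" for i, t in enumerate(titles, start=1))
    return (
        "Here is a list of episode titles:\n"
        f"{numbered}\n\n"
        f"Which one is most likely about {topic}? "
        "Answer with the number only, or 0 if none fits."
    )


def pick_episode(
    titles: Sequence[str],
    topic: str,
    ask_llm: Callable[[str], str],  # hypothetical: prompt in, reply text out
) -> Optional[str]:
    """Return the chosen title, or None if the reply is not a valid choice."""
    reply = ask_llm(build_prompt(titles, topic)).strip()
    try:
        index = int(reply)
    except ValueError:
        return None  # the model did not answer with a number
    if 1 <= index <= len(titles):
        return titles[index - 1]
    return None  # 0 ("none fits") or an out-of-range number


# Sample titles typed in by hand; a real application would retrieve them.
sample_titles = ["Two on a Raft", "Home Sweet Hut", "Seer Gilligan"]


def stub_llm(prompt: str) -> str:
    """Canned reply so the sketch runs without any model."""
    return "3"


if __name__ == "__main__":
    print(pick_episode(sample_titles, "mind reading", stub_llm))
```

Because the reply is constrained to a number and validated, a confused or hallucinated answer turns into `None` rather than a confident-sounding wrong title, which the application can then handle however it likes.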