by 13years
787 days ago
What are the solutions? As pointed out in the article, some LLMs appear to know the information when asked to list episodes, then deny it later. These are general inconsistencies. It is not about looking up trivia; the problem is that you never know the competence level of any answer it gives you.
For example, use LLMs to transform text rather than generate it from scratch (where they are prone to hallucinate). A general-purpose chatbot is not a great use case!
For this particular Gilligan's Island task it'd be better to first retrieve the list of episode titles (or descriptions, if those were needed), then ask the LLM which of them was about "mind reading". There are various ways to do this sort of thing, depending on how specific or constrained the task you are trying to accomplish is. In the most general case you could ask a powerful model like Claude Opus to create a plan composed of simpler steps, but in other cases your application already knows what it wants to do and how to do it, and will call an LLM as a tool for the specific steps the model is capable of.
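To make the retrieve-then-ask idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: `build_prompt`, `pick_episode` and `fake_llm` are hypothetical names, `ask_llm` stands in for whatever model call your application actually uses, and `SAMPLE_TITLES` is a tiny hand-typed sample standing in for a real retrieval step, not a complete episode list. The point is that the model only chooses among titles it was handed, and the application rejects any reply that is not one of them.

```python
from typing import Callable, Optional


def build_prompt(titles: list[str], topic: str) -> str:
    """Ask the model to choose among retrieved titles instead of recalling one."""
    numbered = "\n".join(f"{i}. {title}" for i, title in enumerate(titles, start=1))
    return (
        f"Here is a list of episode titles:\n{numbered}\n\n"
        f"Which one is most likely about {topic}? "
        "Reply with the number only, or 0 if none fit."
    )


def pick_episode(
    titles: list[str], topic: str, ask_llm: Callable[[str], str]
) -> Optional[str]:
    """Return a title from `titles`, or None if the reply is not a valid choice."""
    reply = ask_llm(build_prompt(titles, topic)).strip()
    try:
        index = int(reply)
    except ValueError:
        # The model answered with something other than a number: reject it.
        return None
    if 1 <= index <= len(titles):
        return titles[index - 1]
    # 0 ("none fit") or an out-of-range number.
    return None


# Hand-typed sample standing in for a real retrieval step (assumption).
SAMPLE_TITLES = ["Two on a Raft", "Home Sweet Hut", "Seer Gilligan"]


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; always answers "3"."""
    return "3"
```

Calling `pick_episode(SAMPLE_TITLES, "mind reading", fake_llm)` returns a title taken from the list itself, so the application never has to trust a title the model produced from memory. A reply that does not parse as a valid number comes back as `None`, which the caller can treat as "don't know".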