Hacker News
by abdullin 929 days ago
Putting too much information in the context window is counter-productive in my experience. A low signal-to-noise ratio tends to increase the likelihood of model hallucinations, and we don't want that!

What works in my experience: structure the task like a human-driven workflow and break it down into small steps. Each step can be driven by a small prompt, relevant document fragments (if RAG is used), and condensed essays/tutorials/guides written by a powerful LLM (ideally, GPT-4 pre-Turbo).

Using this approach, you can stay well below an 8k-token limit even on the most demanding tasks.
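
A rough sketch of how one step's context could be assembled under that budget. Everything here is made up for illustration: the `Step` class, `build_context`, and especially `estimate_tokens`, which is a crude 4-characters-per-token guess and not a real tokenizer. Fragments are assumed to arrive already ranked by relevance.

```python
from dataclasses import dataclass, field

TOKEN_BUDGET = 8_000  # the 8k limit mentioned above


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token, rounded up).
    # This is an assumption; swap in your model's tokenizer for real counts.
    return (len(text) + 3) // 4


@dataclass
class Step:
    prompt: str                                          # small, step-specific instruction
    guides: list[str] = field(default_factory=list)      # condensed LLM-written guides
    fragments: list[str] = field(default_factory=list)   # RAG fragments, most relevant first


def build_context(step: Step, budget: int = TOKEN_BUDGET) -> str:
    """Join prompt, guides, and as many top-ranked fragments as fit the budget."""
    parts = [step.prompt, *step.guides]
    if estimate_tokens("\n\n".join(parts)) > budget:
        raise ValueError("prompt and guides alone exceed the budget")
    for fragment in step.fragments:
        candidate = "\n\n".join([*parts, fragment])
        if estimate_tokens(candidate) > budget:
            break  # stop at the first fragment that does not fit
        parts.append(fragment)
    return "\n\n".join(parts)
```

Stopping at the first fragment that does not fit (instead of skipping it and trying smaller ones) keeps only the highest-ranked material in the window, which matches the signal-over-volume point above.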

(Large contexts are leaky on all LLMs anyway.)