Hacker News new | ask | show | jobs
by saretup 218 days ago
Seems reductive. Some applications require higher context length or fast tokens/s. Consider it a multidimensional Pareto frontier you can optimize for.
1 comments

It's not just that some absolutely require it, but a lot of applications hugely benefit from more context. A large part of LLM engineering for real world problems revolves around structuring the context and selectively providing the information needed while filtering out unneeded stuff. If you can just dump data into it without preprocessing, it saves a huge amount of development time.
Depending on the application, I think “without preprocessing” is a huge assumption here. LLMs typically do a terrible job of weighting poor quality context vs high quality context and filling an XL context with unstructured junk and expecting it to solve this for you is unlikely to end well.

In my own experience you quickly run into jarring tangents or “ghosts” of unrelated ideas that start to shape the main thread of consciousness and resist steering attempts.

It depends to the extent I already mentioned, but in the end more context always wins in my experience. If you for example want to provide a technical assistant, it works much better if you can provide an entire set of service manuals to the context instead of trying to put together relevant pieces via RAG.