by boh 411 days ago

I think all the retail LLMs are working to broaden the available context, but in most practical use cases it's the ability to minimize and filter the context that would produce the most value. Even a single PDF with too many similar datapoints leads to confusion in output. They need to switch gears from the high-growth, "everything is possible and available" narrative to one that narrows the scope. The "hallucination" gap is widening with more context, not shrinking.

7 comments

fhd2 411 days ago

Definitely my experience. I manage context like a hawk, be it with Claude-as-Google-replacement or LLM integrations into systems. Too little and the results are off. Too much and the results are off.

Not sure what Anthropic and co can do about that, but integrations feel like a step in the wrong direction. Whenever I've tried tool use, it was orders of magnitude more expensive and generally inferior to a simple model call with curated context from SerpApi and such.

loufe 411 days ago

Couldn't agree more. I wish all major model makers would build tools into their proprietary UIs to "summarize contents and start a new conversation with that base". My biggest slowdown when working with LLMs while coding is moving my conversation to a new thread because the context limit is hit (Claude) or the coherent-thought threshold is exceeded (Gemini).

fhd2 411 days ago

I never use any web interfaces, just hooked up gptel (an Emacs package) to Claude's API and a few others I regularly use, and I just have a buffer with the entire conversation. I can modify it as needed, spawn a fresh one quickly etc. There are also features to add files and individual snippets, but I usually manage it all in a single buffer. It's a powerful text editor, so efficient text editing is a given.

I bet there are better / less arcane tools, but I think powerful and fast mechanisms for managing context are key and for me, that's really just powerful text editing features.

roordan 411 days ago

This is my concern as well. How successful is it in selecting the correct tool out of hundreds or thousands?

Unlike what this integration is pushing, when LLMs are used in production products where high accuracy (99%) is a requirement, you have to give them a very limited tool set to get any degree of success.

Etheryte 411 days ago

This has been my experience as well. The moment you turn internet access on, Kagi Assistant starts outputting garbage. Turn it off and you're all good.

energy123 411 days ago

There's a niche for the kitchen sink approach. It's a type of search engine.

Throw in all context --> ask it what is important for problem XYZ --> curate what it tells you, and feed that to another model to actually solve XYZ
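
A rough sketch of that two-stage flow, in case it helps. Everything here is an assumption for illustration: `selector` and `solver` are hypothetical stand-ins for whatever model calls you use (plain functions from prompt to text), and `curate` is the manual review step, defaulting to a pass-through.

```python
from typing import Callable, List

# Hypothetical type: any function that takes a prompt and returns text.
Model = Callable[[str], str]


def select_relevant(chunks: List[str], problem: str, selector: Model) -> List[str]:
    """Stage 1: show the selector model every chunk and ask which ones matter."""
    numbered = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks))
    reply = selector(
        f"Problem: {problem}\n\nChunks:\n{numbered}\n\n"
        "Reply with the indices of the relevant chunks, comma-separated."
    )
    picked: List[int] = []
    for token in reply.replace(",", " ").split():
        # Ignore anything that is not a valid, in-range, unseen index.
        if token.isdecimal() and int(token) < len(chunks):
            index = int(token)
            if index not in picked:
                picked.append(index)
    return [chunks[i] for i in picked]


def solve(
    chunks: List[str],
    problem: str,
    selector: Model,
    solver: Model,
    curate: Callable[[List[str]], List[str]] = lambda kept: kept,
) -> str:
    """Stage 2: curate the selection, then hand only that to the solver model."""
    relevant = curate(select_relevant(chunks, problem, selector))
    context = "\n".join(relevant)
    return solver(f"Context:\n{context}\n\nProblem: {problem}")
```

The point of keeping `curate` as a separate hook is that the selection from the first model is only a suggestion; you still get to drop or reorder chunks before the second call sees them.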

medhir 411 days ago

you hit the nail on the head. my experience with prompting LLMs is that providing extra context that isn’t explicitly needed leads to “distracted” outputs

ketzo 411 days ago

I mean, to be honest, they gotta do both to achieve what they’re aiming for.

A truly useful AI assistant has context on my last 100,000 emails - and also recalls the details of each individual one perfectly, without confusion or hallucination.

Obviously I’m setting a high bar here; I guess what I’m saying is “yes, and”

mikepurvis 411 days ago

That's a tough pill to swallow when your company valuation is $62B based on the premise that you're building a bot capable of transcendent thought, ready to disrupt every vertical in existence.

Tackling individual use cases is supposed to be something for third-party "ecosystem" companies to go after, not the mothership itself.
