by furyofantares 98 days ago
When the dust settles, for example if LLMs were to stop improving today, we would come to learn their exact capabilities: what they can do reliably and what they can't.

Once we know what they can do well, how to get them to do it well, and what they can't do, you could say we "trust" them with the first category and just stop trying to get them to do the second.

2 comments

This feeds the adoption problem, though. A lot of companies are thinking, "Why settle for the current models when even the vendors are saying the models in six months will be exponentially better? Let's let the early adopters work out the bugs and move when these things are more stable."
LLMs are random by nature; they might get something done one time but fail miserably the next.
I think we're getting to a point where LLM randomness is relevant to someone writing a white paper on LLMs, but not as relevant to consumers of them. Yes, the technology uses randomness, but the quality of responses somehow still seems very consistent and predictable in 2026.
Yeah, and we will continue to learn to use them where the amount of random failure is acceptable or can be mitigated or reduced with additional tools.
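To make "mitigated or reduced with additional tools" concrete, here is a rough sketch of one common pattern: wrap the model call in a validator and retry on failure. The names `generate`, `is_valid`, and `call_with_retries` are hypothetical placeholders, not any vendor's API, and the probability helper assumes attempts fail independently, which real models may not satisfy.

```python
from typing import Callable, Optional


def call_with_retries(
    generate: Callable[[], str],      # hypothetical: wraps one LLM call
    is_valid: Callable[[str], bool],  # hypothetical: checks the output
    max_attempts: int = 3,
) -> Optional[str]:
    """Call a flaky generator until its output passes validation."""
    for _ in range(max_attempts):
        output = generate()
        if is_valid(output):
            return output
    # Every attempt failed; the caller decides what to do next.
    return None


def success_probability(p_fail: float, attempts: int) -> float:
    """Chance that at least one of `attempts` tries succeeds,
    assuming each try fails independently with probability p_fail."""
    return 1.0 - p_fail ** attempts
```

Under that independence assumption, a task that fails 10% of the time would fail three times in a row only about 0.1% of the time, so the validator matters more than the model's raw failure rate. If no reliable validator exists for a task, retries don't help, which is roughly the "second category" from the original post.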