Hacker News new | ask | show | jobs
by simonw 584 days ago
Right. I've been saying for a while that if all LLM development stopped entirely and we were stuck with the models we have right now (GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro, Llama 3.1/2, Qwen 2.5 etc) we could still get multiple years worth of advances just out of those existing models. There is SO MUCH we haven't figured out about how to use them yet.
2 comments

> There is SO MUCH we haven't figured out about how to use them yet.

I mean, it's pretty clear to me they're a potentially great human-machine interface, but trying to make LLMs - in their current fundamental form - a reliable computational tool.. well, at best it's an expensive hack, but it's just not the right tool for the job.

I expect the next leap forward will require some orthogonal discovery and lead to a different kind of tool. But perhaps we'll continue to use LLMs as we knownthem now for what they're good at - language.

One of the biggest challenges in learning how to use and build on LLMs is figuring out how to work productively with a technology that - unlike most computers - is inherently unreliable and non-deterministic.

It's possible, but it's not at all obvious and requires a slightly skewed way of looking at them.

This really reminds me of a trend years ago to create probabilistic programming constructs. I think it was just a trend way ahead of its time. Typical software engineers tend to be very ill-suited to think in probabilities and how to build reasonably reliable systems around them.
LLMs use historic data to help create useful current data. It works well sometimes.

I find that a human is able to solve a P=NP situation, and an LLM can’t quite yet do that. When they can the game changes.