Hacker News
by mdp2021 312 days ago
Anything articulate (hence possibly convincing) which could be «merely [guessing]» should either be locked out of consequential questions, or fixed.

We're still at "that's just how it works". The LLM isn't aware of any consequences. All it does is complete patterns as trained, and the training data contains many instances of articulate question answering.

It is up to those using the LLM to be aware of its capabilities, or else not be allowed to use it. It's like a child unaware that running a finger along a sharp knife blade will lead to a bad cut: you don't dull the blade to keep the child safe, you keep the child away from the knife until they can understand and respect its capabilities.

If your prototype of the «knife» is all blade and no handle, fix it and implement the handle.

If the creation was planned, you will have also thought of the handle; if it was serendipitous, you will have to plan the handle afterwards.

Pretty sure it doesn't matter to the child whether the knife has a handle or not. They'll eventually find a way to cut themselves.
It matters to the adult, who is also a user.

LLMs do not deliver (they lack important qualities related to intelligence); they are here now; so they must be superseded.

There is no excuse: they must be fixed urgently.

LLMs deliver pretty well on their intended functionality: they predict next tokens given a token history and patterns in their training data. If you want to describe that as fully intelligent, that's your call, but I personally wouldn't. And adding functionality that isn't directly related to improving token prediction is just bad practice in an already very complex creation. LLM tools exist for that reason: they're the handles, sheaths, sharpeners, etc. for the knife. Teach the adults who are getting themselves cut to hold the knife by the handle and to use the other accessories that improve the user experience.
> given a token history and patterns in their training data. If you want to describe that as fully intelligent

No, I would call (an easy interpretation of) that an implementation of unintelligence. Following patterns is what a hearsay machine does.

The architecture you describe at the "token prediction" level collides with an architecture in which ideas get related with better justifications than frequent co-occurrence. Given that the outputs will be similar in form, and that "dubious guessers" are now in place, we are now bound to hurry towards the "certified guessers".