Hacker News
by yunohn 605 days ago
How does the next machine/LLM know what’s valid or not? I don’t really understand the idea behind layers of hallucinating LLMs.

By comparison with reality. The initial LLMs had "reality" be "a training set of text". When ChatGPT came out, everyone rapidly expanded into RLHF (reinforcement learning from human feedback), and now that there are vision and text models, the training and feedback are grounded in a much broader slice of reality than just text.
Given that there are more and more AI-generated texts and pictures, that grounding will be pretty unreliable.
Perhaps. But CCTV cameras and smartphones are huge sources of raw content of the real world.

Unless you want to take the argument of Morpheus in The Matrix and ask "what is real?"

So let’s crank up total surveillance for better auto descriptions of a picture.

We aren’t exchanging freedom for security anymore, which could be reasonable under certain conditions; we just get convenience. Bad deal.

That's one way to do it, but overkill for this specific thing: self-driving cars, robotics, or natural use of smart-[phone|watch|glass|doorbell|fridge] are likely sufficient.

Total surveillance may be necessary for other reasons, like making sure organised crime can't blackmail anyone because the state already knows it all, but it's overkill for AI.

Could you link to a paper or working POC that shows how this “turtles all the way down” solution works?
I don't understand your question.

This isn't turtles all the way down, it's grounded in real world data, and increasingly large varieties of it.

How does the AI know it’s reality and not a fake image or text fed to the system?
I refer you to Wachowski & Wachowski (1999)*, building on previous work including Descartes and A. J. Ayer.

To wit: humans can't either, so that's an unreasonable question.

More formally, the tripartite definition of knowledge is flawed, and everything you think you know is subject to the Münchhausen trilemma.

* Genuinely part of my A-level in philosophy

So we get the same flaws as before with a higher power consumption.

And because it’s fast and easy we now get more fakes, scams and disinformation.

That makes AI a lose-lose, not to mention the further negative consequences.

I wasn’t expecting your response to be “the truth is unknowable”, but was hoping for something of more substance to discuss.
Does the AI need to know, or the curator of the dataset? If the curator took a camera and walked outside (or let a drone wander around for a while), do you believe this problem would still arise?