by zeroq 10 days ago
But that's exactly what LLMs are. :)

My mental model and go-to ELI5 is "imagine you compressed the whole internet into a zip-like archive and you have an extremely clever and efficient way to search it for data".

I'm old enough to remember the time when you could order Wikipedia on CDs, and I don't see much difference between that and downloading an LLM.
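
One way to make the "compressed archive" picture concrete: a model that predicts the next symbol well can also be used to compress, because an ideal entropy coder spends about -log2(p) bits on a symbol the model assigned probability p. Below is a minimal sketch of that link using a toy adaptive character model, which is an assumption for illustration and not an LLM or any particular project's method. The names `ideal_bits` and `uniform_bits` and the 256-symbol alphabet are hypothetical.

```python
import math
from collections import defaultdict


def ideal_bits(text, order=1, alphabet_size=256):
    """Bits an ideal entropy coder would spend on `text` when driven by an
    adaptive character model that conditions on the previous `order` chars.

    Assumes the text uses at most `alphabet_size` distinct characters.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> char -> count
    totals = defaultdict(int)                       # context -> total count
    bits = 0.0
    for i, ch in enumerate(text):
        ctx = text[max(0, i - order):i]
        # Add-one smoothing keeps every symbol's probability above zero.
        p = (counts[ctx][ch] + 1) / (totals[ctx] + alphabet_size)
        bits += -math.log2(p)
        # Update only after "coding" the symbol, so a decoder could mirror it.
        counts[ctx][ch] += 1
        totals[ctx] += 1
    return bits


def uniform_bits(text, alphabet_size=256):
    """Cost with no model at all: every symbol is equally likely."""
    return len(text) * math.log2(alphabet_size)


sample = "the quick brown fox jumps over the lazy dog. " * 40
baseline = uniform_bits(sample)
modeled = ideal_bits(sample, order=2)
print(f"no model: {baseline:.0f} bits, with model: {modeled:.0f} bits")
```

The better the predictor, the fewer bits are needed, which is why swapping the toy counter for a much stronger language model would, in principle, give a much stronger compressor.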

That is true, but I have to be honest and say that I didn’t make the connection until I saw Bellard’s project for the first time, and I said: “ah! That actually makes A LOT of sense”.

My biggest gripe about AI is that very few people actually understand that, and many think that LLMs are "thinking" and capable of "coming up with a novel solution".

They are not. The only reason one might think the solution is novel is that they have never seen it before, but what they are actually receiving is an excerpt from someone else's blog post or Stack Overflow answer. [1]

A slightly terrifying thought experiment is to accept for a moment that programming is dead and all that's left is prompt engineering. Fast forward 5, 10, or 15 years: who's left to actually produce new code and ideas to feed the LLMs?

[1] One thing I like to do from time to time, especially when I'm asking about something I know little about, is to copy and paste the answer back into Google and see where it originated.

One time I asked for a very specific Linux shell command and the answer didn't sit right with me. I googled it and it pointed me to a Stack Overflow question. It was the first answer, with ~1000 upvotes. But it also had a comment with ~700 upvotes explaining why you should never, ever do that. :)

My biggest gripe with the discourse around AI, especially by programmers with hubris about Machine Learning, is the idea that LLMs can’t come up with “novel solutions”. They can, and they have. CoT[0] is how LLMs can output tokens in “reasoning space” to guide their “thinking” to produce absolutely novel solutions. You can imagine reasoning being multi-layered, where the top layer is an abstract heuristic (for example, “try special cases” or “try solving a part of the problem with relaxed constraints”). The lower layers become more and more concrete with the details of the problem, and the result is a solution to the problem.

You don’t even have to understand how modern reasoning LLMs work to be able to tell that your perception is warped and doesn’t reflect reality. There’s plenty of news to the contrary: OpenAI resolving a major Erdős problem[1], the First Proof endeavour[2], amongst others[3].

[0]: https://arxiv.org/abs/2201.11903
[1]: https://openai.com/index/model-disproves-discrete-geometry-c...
[2]: https://1stproof.org/assets/docs/report.pdf
[3]: https://archive.ph/2w4fi

Also, a tangent (I can't find it right now): something that feels quite similar, albeit far less practical, was an experiment in which a neural network was laid out as a 2D grid, i.e. a screen, and trained so that specific inputs would fire very specific neurons, so that the "screen" would show a specific image.

what was particularly interesting about that experiment was the fact that you could pack quite a few images in a very small network.