by mike_hearn 559 days ago
Again, read the papers. They absolutely do know facts, and that can be seen in the activations. Your description is oversimplified. It's easy to get models to emit statistically improbable but correct sequences of words. They are not just looking at which words appear near each other; that alone doesn't lead to the kind of output LLMs are capable of.
1 comment

Exactly. People forget that we built systems that were just Markov chains long before LLMs, like the famous Usenet poster "Mark V. Shaney" (co-created by Rob Pike of Plan 9 and Go fame), which was trained on Usenet posts in the 1980s. You didn't need deep learning or any sort of neural net for that. It could come up with sentences that sometimes made some sort of sense, but that was it. The oversimplified way LLMs are sometimes explained makes it sound like they are no different from Mark V. Shaney, but they obviously are.

https://en.wikipedia.org/wiki/Mark_V._Shaney
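
For anyone who hasn't seen one, here is a rough sketch of the kind of word-level Markov chain generator being described. It is not the original Mark V. Shaney code; the two-word state, the function names `build_chain` and `generate`, and the toy corpus are all my own assumptions for illustration. The point is that the whole "model" is a lookup table from the last couple of words to the words that followed them in the training text.

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)

def generate(chain, length=20, seed=None):
    """Walk the chain: pick a start state, then repeatedly sample a follower."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a fixed seed is reproducible
    out = list(state)
    for _ in range(length - len(state)):
        followers = chain.get(state)
        if not followers:  # dead end: this state only occurred at the end of the text
            break
        nxt = rng.choice(followers)
        out.append(nxt)
        state = state[1:] + (nxt,)  # slide the window forward by one word
    return " ".join(out)

# Toy corpus (hypothetical); a real run would use a large pile of posts.
corpus = "the cat sat on the mat and the cat ate the fish on the mat"
chain = build_chain(corpus)
print(generate(chain, length=12, seed=0))
```

Every three-word window it emits is copied from the training text, so the output is locally plausible and globally aimless, and it can never produce a continuation it hasn't literally seen after that exact state.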