Hacker News new | ask | show | jobs
by janalsncm 1063 days ago
Are large language models even Turing complete? Or more specifically, is there something we can say about LLMs as a class with respect to this question?

For any architecture like Vaswani’s GPT or a bigger iteration of it, eventually you run out of attention heads and layers.

If the answer is categorically no, then any sufficiently sophisticated code is not “promotable”. However, I don’t think there’s anything in principle which prevents LLMs from being Turing complete.

4 comments

> Are large language models even Turing complete?

Idealized deterministic computing systems are the only thing that can be Turing complete, actual systems cannot be (because Turing completeness requires infinite space), LLMs are actual systems, and also are not limited-space approximation of idealized deterministic systems (they are, I suppose, deterministic if you know all the relevant parameters, including potentially some that are hardware-dependent, but they generally are a deterministic approximation of a nondeterministic system.) You can, of course, prompt an LLM to predict the output of a deterministic system and to do direct computation, but, absent an interface to external tools that actually do the computation, the results for that are notoriously unreliable.

> Idealized deterministic computing systems are the only thing that can be Turing complete

That’s not true. My computer is for all practical purposes Turing complete - it’s tape is not the RAM, but due to side effecting, being connected to the internet, the whole universe. So while the universe itself is finite, nothing material can be mathematically infinite, Turing completeness fails “lazily”. Unless you hit the limits, it is as good as infinite.

As for the current topic, prompting problems are after a while just memoization to some limit with some strange encoding.

> My computer is for all practical purposes Turing complete

“For all practical purposes” is a long way of saying “not”; a large-but-finite tape is not infinite, and the key properties of Turing completeness (both universal computation and the consequent equivalence with all other Turing complete systems) do not hold with “finite but large tapes”, no matter if large is 640 kilobytes or 640 quettabytes. Particularly, differently structured “Turing complete but for finite size” systems of similar actual capacity in bytes are not guaranteed to be able to compute the same subset of all computable results. (Actually Turing machines with the same size tape would be, but “Turing complete but for size” does not imply a consistent ratio between problems of material storage space to equivalent Turing machine tape size.)

Can you give me any way that can differentiate between my practical machine with memory sized n, and a real infinite Turing machine in a finite amount of time t?

If not, than for all purposes the two are the exact same, which is my point. This is not the case with LLMs.

Sure, then a trivial example of a non-“promotable” input would be an input containing a problem whose solution requires more memory than any computer currently has. But I don’t think that’s what they were looking for.
With memory they are. https://arxiv.org/abs/2301.04589
You are mixing up LLMs with Transformers. Transformers with memory are Turing complete, but AFAIK, current state of the art LLMs aren't trained with any kind of memory.
I'm not. The paper specifically deals with language models in particular. Memory is just dynamically inserted context
So as I see it, it's possible to teach SOTA LLM with prompts to emulate a state machine part of the Turing machine. Am I right?
Ok. I see. Sorry.
> Are large language models even Turing complete?

If you want to see where they have problem ask them to do something about deep hierarchical objects. For example, consider this prompt: "Draw me a complete binary tree with numbers from 1 to 128 using pseudographics"

In my experience, the deeper the structure, the more problematic it is for the current generation of LLMs.

attention is turing complete https://news.ycombinator.com/item?id=36332033

I'd guess it's not as efficient as a native algorithm i many cases though

It's not realistically Turing complete. It assumes infinite precision.
Right but a Turing computer assumes infinite storage space which is itself impossible. You cannot have infinite precision without infinite storage, and all real computers that we colloquially say are Turing complete have finite everything.
It's possible to argue that there's no such a thing as infinite precision floats. I.e. all objects have limited "size", and real numbers is just a nice abstraction to simplify discrete world. (https://en.wikipedia.org/wiki/Finitism)

Turing machine is not such thing. At each moment in time, only finitely many cells of the tape are used. (The same applies to natural numbers, there are infinitely many of them, but each one of them has a finite description).

It lazily uses infinite tape though, so it is Turing complete for an actual, finite number of steps. You can’t have infinite precision even for a single step.