Unfortunately, this is only deterministic on the same hardware, but there is no reason why one couldn't write reasonably efficient LLM kernels. It just has not been a priority.
Nevertheless, I still agree with the main point that it is difficult to get LLMs to produce the same output reliably. A small change in the context might trigger all kinds of changes in the generated code.
Unfortunately, this is only deterministic on the same hardware, but there is no reason why one couldn't write reasonably efficient LLM kernels. It just has not been a priority.
Nevertheless, I still agree with the main point that it is difficult to get LLMs to produce the same output reliably. A small change in the context might trigger all kinds of changes in the generated code.