You misconstrue the analogy. The robot isn’t equivalent to the code in this analogy. It’s the thing that generates the code.
The robot operates deterministically: it has a fixed input and a fixed output. This is what makes it reliable.
Your “AI coder” is nothing like that. It’s non-deterministic on its best day, and because it gets everything thrown at it, the result is even more of a coin toss. This seriously undermines any expectation of reliability.
The guy’s comparison shows a lack of understanding of either of the systems.
I totally understand that inversion but I think it's a bad analogy.
Industrial automation works by taking rigorously specified designs developed by engineers and combining them with rigorous quality control processes to ensure the inputs and outputs remain within tolerances. You first have to have a rigorous spec; then you can design a process for manufacturing a lot of widgets while checking 1 out of every 100 of them against their tolerances.
You can only get away with not measuring a given angle on widget #13525 because you're producing many copies of exactly the same thing and you measured that angle on widget #13500 and widget #13400 and so on and the variance in your sampled widgets is within the tolerances specified by the engineer who designed the widget.
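For what it's worth, here is a rough sketch of the kind of sampled check described above. Everything in it is made up for illustration: the function name `sampled_qc`, the 90-degree nominal angle, the 0.5-degree tolerance, and the simulated measurements are all assumptions, not anything from a real production line.

```python
import random

def sampled_qc(measurements, nominal, tolerance, every=100):
    """Inspect every `every`-th widget; return (number checked, number out of spec)."""
    sampled = measurements[::every]
    out_of_spec = [m for m in sampled if abs(m - nominal) > tolerance]
    return len(sampled), len(out_of_spec)

# Simulated angle measurements for 20,000 widgets from a stable process
# (hypothetical numbers: nominal 90.0 degrees, small random variation).
rng = random.Random(0)
angles = [rng.gauss(90.0, 0.05) for _ in range(20000)]

checked, failed = sampled_qc(angles, nominal=90.0, tolerance=0.5)
print(checked, failed)
```

Skipping the 99 unmeasured widgets between samples is only defensible because the process producing them is the same every time. A defect in a widget that isn't sampled goes unnoticed, which is acceptable only when the sampled variance stays well inside the spec.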
There's no equivalent to the design stage or to the QC stage in the vibe-coding process advocated for by the person quoted above.
I don't know what you mean by "the code it creates is deterministic" but the process an LLM uses to generate code based on an input is definitely not entirely deterministic.
To put it simply, the chances that an LLM will output the same result every time given the same input are low. The LLM does not operate deterministically, unlike the manufacturing robot, which will output the same door panel every single time. Or as ChatGPT put it:
> The likelihood of an LLM like ChatGPT generating the exact same code for the same prompt multiple times is generally low.
For any given seed value, the output of an LLM will be identical: it is deterministic. You can try this at home with Llama.cpp by specifying a seed value when you load an LLM, and then seeing that for a given input the output will always be the same. Of course there may be some exceptions (cosmic ray bit flips). Also, if you are only using online models, you can't set the seed value, plus there are multiple models, so multiple seeds. In summary, LLMs are deterministic.
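If you don't have a model handy, the same idea can be shown with a toy sampler. To be clear, this is not an LLM and not how Llama.cpp is implemented: the vocabulary, the weights, and the `sample_tokens` function are all invented for illustration. It only shows that "random" sampling driven by a fixed seed gives the same sequence on every run.

```python
import random

# Hypothetical toy vocabulary and fixed next-token weights. A real LLM would
# compute the weights from the prompt; here they are constants.
VOCAB = ["def", "return", "x", "+", "1", "(", ")", ":"]
WEIGHTS = [3, 2, 4, 1, 2, 1, 1, 1]

def sample_tokens(seed, n_tokens=12):
    """Draw n_tokens from the toy distribution using a private RNG seeded with `seed`."""
    rng = random.Random(seed)
    return rng.choices(VOCAB, weights=WEIGHTS, k=n_tokens)

first_run = sample_tokens(seed=1234)
second_run = sample_tokens(seed=1234)
print(first_run == second_run)  # True: same seed, same token sequence
```

Leave the seed unpinned (for example, seed from the clock) and the sequence will usually change between runs, which is roughly the situation with hosted models where you can't set the seed yourself.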
> the process an LLM uses to generate code based on an input is definitely not entirely deterministic
Technically correct is the least useful kind of correct when it's wrong in practice. And in practice the process AI coding tools use to generate code is not deterministic, which is what matters. To make matters worse in the comparison with a manufacturing robot, even the input is never the same. While a robot gets the exact same command for a specific motion and the exact same piece of sheet metal in the same position, a coding AI is asked to work with varied inputs and on varied pieces of code.
Even stamping metal could be called "non-deterministic" since there are guaranteed variations, just within determined tolerances. Does anyone define tolerances for generated code?
That's why the comparison shows a lack of understanding of either of the systems.
I don't really understand your point.
An LLM is loaded with a seed value, which is a number. The number may be chosen through some random or pseudo-random process, or specified manually. For any given seed value, say 80085, the LLM will always and exactly generate the same tokens. It is not like stamped sheet metal, because it is digital information, not matter. Say you load up R1, and give it a seed value of 80085, then say "hi" to the model. The model will output the exact same response, to the bit: same letters, same words, same punctuation, same order. Deterministic.
There is no way you can say that an LLM is non-deterministic, because that would be WRONG.
I feel this is technically correct but intentionally cheating. No one, including the model creators, expects that to be the interface; it undermines the entire value proposition of using an LLM in the first place if I need to engineer the inputs to ensure reproducibility. I'd love to hear some real-world scenarios that do this where it wouldn't be simpler to NOT use AI.
When should a model's output be deterministic?
When should a model's output be non-deterministic?
When many humans interact with the same model, then maybe the model should try different seed values, and make measurements.
When model interaction is limited to a single human, then maybe the model should try different seed values, and make measurements.