
Code can be deterministic if you're only doing trivial things or have very simple systems. Beyond that, it is only deterministic enough to cross the threshold into being considered useful. Dead letter queues, uncaught errors, kernel panics, race conditions, deadlocks, cosmic bit flips, dumb, avoidable bad choices stemming from unexpressive languages, malicious actors, resource constraints, and so on: the real world of software we live in is a duct-taped-together mess of half measures that, at its best, only mostly does what we want. So much of the work in product programming is handling all the things that can go wrong.

So yeah, prompt "engineering" is indeed a silly term, but software "engineering" kicked off the dilution of that word ages ago. And GPT models can be inspected and measured on their inputs and outputs, prompts can be analyzed for their effects and usefulness, and temperature settings even directly control some degree of determinism. It's not like models change on a whim, unless you're just using end-user products. Anthropic, Hugging Face, AWS, OpenAI: they all let you pick a released model version in your API calls and stick with it for a long time. If you're self-hosting a fine-tuned Llama 70B, nobody will ever force you to update it once you get it doing a task to your expectations. The quality of deterministic behavior in AI is currently lower than that of Excel or C code, but it's also serving a wholly different purpose: people want it to be creative and produce novel, nondeterministic outputs. Comparing them is a bit silly.
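To make the temperature point concrete, here is a toy sketch of temperature sampling in plain Python. It is not any vendor's implementation: `sample_token`, `run`, and the logits are made-up names and numbers for illustration. The assumption is the usual one, that logits are divided by the temperature before a softmax, and that temperature 0 is treated as greedy argmax.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw scores (hypothetical helper)."""
    if temperature == 0:
        # Greedy decoding: always the highest-scoring token, no randomness.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Divide by temperature, then softmax. Subtracting the max keeps exp() stable.
    scaled = [x / temperature for x in logits]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

def run(seed, temperature=1.0, steps=20):
    """Sample a short sequence with a fixed seed (hypothetical helper)."""
    logits = [2.0, 1.5, 0.3]  # made-up scores for three candidate tokens
    rng = random.Random(seed)
    return [sample_token(logits, temperature, rng) for _ in range(steps)]

# Temperature 0 gives the same sequence regardless of seed.
print(run(seed=1, temperature=0) == run(seed=2, temperature=0))
# A fixed seed makes even temperature 1.0 repeatable in this toy.
print(run(seed=42) == run(seed=42))
```

In this toy, temperature 0 or a fixed seed gives fully repeatable output. Hosted APIs are messier: even with a pinned model version and temperature 0, you may still see small run-to-run differences, so treat this as the idea rather than a guarantee.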