by vlovich123
337 days ago
> The problem with the former (output) is that you cannot guarantee the output of an AI on a consistent basis

Do you mean you cannot guarantee the result based on a task request with a random query? Or something else? I was under the impression that LLMs are very deterministic if you provide a fixed seed for the samplers, fixed model weights, and fixed context. With cloud providers you can't guarantee this because of how they implement inference (batching unrelated requests together and doing math).

Now, you can't guarantee the quality of the result from that, and changing the seed or context can result in drastically different quality. But maybe you really do mean non-deterministic, in which case I'm curious where this non-determinism would come from.
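To make the determinism claim concrete, here is a rough sketch rather than anything resembling a real inference stack. `toy_logits`, `VOCAB`, and `sample` are hypothetical stand-ins I made up for illustration: the "model" is just a pure function of the context, and the sampler is seeded. The second half illustrates one plausible source of the cloud-side drift, assuming it comes from floating-point sums being evaluated in a different order when requests are batched differently.

```python
import math
import random

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["4", "four", "2+2", "is", "the", "answer"]


def toy_logits(context):
    # Stand-in for fixed model weights: a pure function of the context,
    # so the same context always yields the same logits.
    return [math.sin(len(context) * (i + 1)) for i in range(len(VOCAB))]


def sample(context, seed, n_tokens=8, temperature=1.0):
    # A private, seeded RNG plays the role of the "fixed seed for the samplers".
    rng = random.Random(seed)
    out = list(context)
    for _ in range(n_tokens):
        weights = [math.exp(logit / temperature) for logit in toy_logits(out)]
        out.append(rng.choices(VOCAB, weights=weights, k=1)[0])
    return out[len(context):]


# Same seed, same "weights", same context: identical token sequences.
run_a = sample(["2+2"], seed=1234)
run_b = sample(["2+2"], seed=1234)
print(run_a == run_b)

# Floating-point addition is not associative, so summing the same values
# in a different order can give a different result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right, left, right)
```

The first comparison holds on every run; the second shows two orderings of the same three numbers disagreeing in the last bits, which is the kind of difference that could in principle nudge a sampler toward a different token.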
|
That's all input-side, though. On the output side, you can essentially give an LLM anxiety by asking the exact same question in different ways, and the machine no longer understands that you're asking the exact same question.
For instance, take one of these fancy "reasoning" models and ask it variations on 2+2. Try two plus two, 2 plus two, deux plus 2, TwO pLuS 2, etc., and observe its "reasoning" outputs to see the knots it ties itself up in trying to understand why you keep asking for the same calculation over and over again. When I ran an older DeepSeek model locally, the "reasoning" portion kept growing in time and tokens as it struggled to supply context that didn't exist for a simple problem that older/pre-AI models wouldn't bat an eye at before spitting out "4".
Trying to wrangle consistent, reproducible outputs from LLMs without guaranteeing consistent inputs is a fool's errand.
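A quick sketch of why those variants count as different inputs, assuming nothing about any particular model: each phrasing is a different byte sequence, so even a fully deterministic function of the context is being handed a different context each time. The `naive_normalize` helper is hypothetical and only there to show that trivial cleanup does not collapse the variants.

```python
# The phrasings from the comment above, plus the bare arithmetic form.
variants = ["2+2", "two plus two", "2 plus two", "deux plus 2", "TwO pLuS 2"]

# What a model actually receives is derived from the raw text, so compare
# the raw UTF-8 bytes of each prompt.
raw_contexts = {p.encode("utf-8") for p in variants}
print(len(raw_contexts), "distinct raw inputs out of", len(variants))


def naive_normalize(prompt: str) -> str:
    # Hypothetical cleanup: case-fold and collapse whitespace only.
    return " ".join(prompt.casefold().split())


normalized = {naive_normalize(p) for p in variants}
print(len(normalized), "distinct inputs after naive normalization")
```

Case-folding turns "TwO pLuS 2" into "two plus 2", which still matches none of the others, so all five remain distinct inputs.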