> I expected a response also in ...

Exactly, you expected it, but that doesn't change what's actually happening. The model doesn't know what you expect. It can't read your mind. The best it can do is infer some things, such as that English input should produce English output - and the models are indeed pretty good at that!

> to something that comports with reality.

This is a rather unrealistic expectation in general, when you examine it. You raised a good example for doing exactly that, though:

> it actually wouldn't be clever or insightful for someone to say "no that's just a result of how the computer was programmed! You're imposing a human understanding of correctness on your bank balance!"

You're right, it wouldn't, because that's a very different situation, and the difference helps illustrate the point. The code for the bank app has been written to match your notion of correctness. That's only possible because it has a narrowly defined, specific purpose. It has all the information it needs to produce a correct response. The acceptance criteria are clear, including validation and integrity checks on the response. As a result, your expectations should be satisfied, and if they aren't, it makes sense to say that the bank app is not correct.

None of that applies to the AI models we're discussing. An LLM or image model doesn't have a narrowly defined, specific purpose. It can't possibly have access to all the information it needs to "answer" any possible "question" "correctly". It can't possibly have access to acceptance criteria specific to a question unless they're provided explicitly and in detail as part of a prompt - again, underscoring the importance of prompt engineering. And its ability to validate responses - check whether they "comport with reality" - is very limited, at least currently.
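To make "provided explicitly and in detail" concrete, here's a rough sketch in Python. Everything in it is hypothetical and made up for illustration (`Criteria`, `build_prompt`, `check_response`); it doesn't call any real model or library. It just shows expectations being written into the prompt instead of assumed, plus the kind of shallow check you can actually run on what comes back:

```python
from dataclasses import dataclass


@dataclass
class Criteria:
    """Hypothetical acceptance criteria, stated up front rather than assumed."""
    language: str                      # language the response should be in
    max_words: int                     # upper bound on response length
    required_terms: tuple[str, ...]    # terms the response must mention


def build_prompt(question: str, c: Criteria) -> str:
    """Append the criteria to the question so the model doesn't have to guess them."""
    lines = [
        question,
        "",
        "Requirements:",
        f"- Respond in {c.language}.",
        f"- Use at most {c.max_words} words.",
    ]
    if c.required_terms:
        lines.append("- Mention: " + ", ".join(c.required_terms) + ".")
    return "\n".join(lines)


def check_response(text: str, c: Criteria) -> list[str]:
    """Return the criteria the response fails; an empty list means none failed.

    Only mechanical checks are done here (length, required terms).
    Language and factual accuracy are NOT verified.
    """
    problems = []
    if len(text.split()) > c.max_words:
        problems.append("too long")
    lowered = text.lower()
    for term in c.required_terms:
        if term.lower() not in lowered:
            problems.append(f"missing term: {term}")
    return problems
```

Note what the check can't do: it can tell you the response is too long or skipped a term, but not whether it's true. That gap is the "very limited" validation I mean.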

An example that's closer to the situation with an AI model would be a tool like a hammer. If you hold a hammer by its head and try to hammer in a nail with its handle, is the hammer "incorrect" when it fails at the task you have "asked" it to do?

> I don't care why it's giving me wrong information.

Just as with the hammer, if you want to be able to use these tools effectively, you should care why.