HN Mirror

Wow, I don't think you understood at all.

First, the "13+7" is an analogy. In this analogy, "13+7" is not the real question you ask; it stands for _any question_, not just arithmetic.

But secondly, did you even notice that in my example, the system answers "13+7" CORRECTLY? So, in my example, the thing I'm talking about, the thing I argue does not "understand", is Claude, even though it is able to answer correctly.

My point is: the "basic LLM" part creates a mechanism that answers without understanding (as demonstrated, for example, by ChatGPT failing arithmetic), and the fine-tuning or the harness just hides the lack of understanding by adding ad-hoc corrections on the residuals. And because the correction is on the residuals, it loses the logical links. 13+7 -> 20 is "logical": it corresponds to the math, to what you get when you put 13 stones and 7 stones together. The residual is "14 -> 20", which has no meaning in itself.

The ad-hoc correction is either: 1. training the model so it learns by heart, without understanding, that the symbols "13+7" should lead to "20", or 2. training the model to use a pocket calculator, without it understanding arithmetic well enough to do the sum itself.

You can prove that the model does not understand very simply. Take the normal fine-tuned model, M1. Now go back to the pre-tuned version, fine-tune it so it answers "21" to the question "13+7", and use a harness that does "sum(x, y): return x+y+1". This is model M2. M2 will fail to answer "13+7" correctly: it will say "21". And yet M2 has been trained exactly the same way M1 was. If it were true that the additional tuning "adds understanding", M2 would not be possible: it would say "error, error, does not compute, you are trying to train me to say that 13+7 is 21, but that makes no logical sense to me". But that does not happen: the pre-tuned model has no idea that 13+7=20 is more logical than 13+7=21, and the additional tuning just helps it return a more correct answer while it still has no idea where this answer comes from.
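To make the M1/M2 thought experiment concrete, here is a toy sketch in Python. It is not how a real LLM is fine-tuned: `fine_tune`, `harness_m1` and `harness_m2` are hypothetical names made up for illustration, and the "model" is just a lookup table plus a tool call, standing in for the two ad-hoc corrections (learning by heart, and calling a calculator). The only point it shows is that the same procedure produces M1 or M2 equally happily.

```python
from typing import Callable, Dict

Tool = Callable[[int, int], int]


def harness_m1(x: int, y: int) -> int:
    """Hypothetical calculator tool used by M1: ordinary addition."""
    return x + y


def harness_m2(x: int, y: int) -> int:
    """Hypothetical calculator tool used by M2: the off-by-one sum."""
    return x + y + 1


def fine_tune(examples: Dict[str, str], tool: Tool) -> Callable[[str], str]:
    """Toy stand-in for fine-tuning (an assumption, not a real training loop).

    The returned "model" answers memorized prompts by heart and hands
    everything else to the tool. Nothing here checks whether the answers
    are consistent with arithmetic.
    """
    memorized = dict(examples)

    def model(prompt: str) -> str:
        if prompt in memorized:
            # Correction 1: the answer was learned by heart.
            return memorized[prompt]
        # Correction 2: delegate to the calculator without checking it.
        x, y = (int(part) for part in prompt.split("+"))
        return str(tool(x, y))

    return model


# Same procedure, different labels and harness.
m1 = fine_tune({"13+7": "20"}, harness_m1)
m2 = fine_tune({"13+7": "21"}, harness_m2)

print(m1("13+7"), m2("13+7"))  # 20 21
print(m1("2+2"), m2("2+2"))    # 4 5
```

Nothing in `fine_tune` objects to the M2 labels, which is the whole argument: the procedure accepts "21" as readily as "20".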