Hacker News
by turtleyacht 57 days ago
What result did it give for "two plus three"?
1 comment

It produces an arithmetic program, but with the wrong operands. In this context the frozen LLM's hidden states for "two" and "2" are nearly orthogonal (cosine similarity 0.09), so the head can't extract the right numbers. "2 plus 3" works fine and draws 5. The model understands the task structure but can't bridge word form to digit form without token generation.
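
For anyone unfamiliar with the metric, here is a minimal sketch of how a cosine similarity like the 0.09 above is computed. The function name and the toy 4-d vectors are made up for illustration; they are not the model's actual hidden states, and this is not necessarily how the measurement was done.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical stand-ins for hidden states, NOT real activations.
h_word = [1.0, 0.0, 0.0, 0.0]    # pretend state for "two"
h_digit = [0.0, 1.0, 0.0, 0.0]   # pretend state for "2"

print(cosine_similarity(h_word, h_digit))               # 0.0 (orthogonal)
print(cosine_similarity(h_word, [2.0, 0.0, 0.0, 0.0]))  # 1.0 (same direction)
```

A value near 0 means the two vectors share almost no direction, so a linear readout that works on one would likely tell you little about the other; a value near 1 means they point the same way regardless of length.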