Hacker News
by embedding-shape 67 days ago
> our current very little understanding of the internal computations of these models does not support your position

Our current understanding is sufficient to know that you cannot ask an LLM to explain its behavior and expect it to do so correctly. I'm not sure what research you've read that led you to believe this could be possible in the first place, but I'm happy to receive links to read through, if you're sitting on them.

1 comment

Explanations can be faithful sometimes. That's the standard we can expect for any intelligence as far as we're aware.

https://arxiv.org/abs/2504.14150