|
|
|
|
|
by famouswaffles
68 days ago
|
|
I did answer it, albeit not directly. "Guaranteed to be the motivation" isn't a standard anyone can meet, and so framing it that way doesn't really probe anything meaningful about LLMs specifically. If what you want to hear is No, then sure, have your No, but it doesn't mean anything. There's just not much to the question. Even though you had it up as one borne of a greater understanding of LLMs, the interpretability research we have so far, and our current very little understanding of the internal computations of these models does not support your position and certainly not how assured you are about it. |
|
Our current understanding is sufficient to know you can not ask the LLM to explain it's behavior and it can correctly do so, I'm not what research you've read to believe this could be possible in the first place, but happy to receive links to read through, if you're sitting on them.