|
|
|
|
|
by aithrowawaycomm
443 days ago
|
|
I struggled reading the papers - Anthropic’s white papers reminds me of Stephen Wolfram, where it’s a huge pile of suggestive empirical evidence, but the claims are extremely vague - no definitions, just vibes - the empirical evidence seems selectively curated, and there’s not much effort spent building a coherent general theory. Worse is the impression that they are begging the question. The rhyming example was especially unconvincing since they didn’t rule out the possibility that Claude activated “rabbit” simply because it wrote a line that said “carrot”; later Anthropic claimed Claude was able to “plan” when the concept “rabbit” was replaced by “green,” but the poem fails to rhyme because Claude arbitrarily threw in the word “green”! What exactly was the plan? It looks like Claude just hastily autocompleted. And Anthropic made zero effort to reproduce this experiment, so how do we know it’s a general phenomenon? I don’t think either of these papers would be published in a reputable journal. If these papers are honest, they are incomplete: they need more experiments and more rigorous methodology. Poking at a few ANN layers and making sweeping claims about the output is not honest science. But I don’t think Anthropic is being especially honest: these are pseudoacademic infomercials. |
|
I'm honestly confused at what you're getting at here. It doesn't matter why Claude chose rabbit to plan around and in fact likely did do so because of carrot, the point is that it thought about it beforehand. The rabbit concept is present as the model is about to write the first word of the second line even though the word rabbit won't come into play till the end of the line.
>later Anthropic claimed Claude was able to “plan” when the concept “rabbit” was replaced by “green,” but the poem fails to rhyme because Claude arbitrarily threw in the word “green”!
It's not supposed to rhyme. That's the point. They forced Claude to plan around a line ender that doesn't rhyme and it did. Claude didn't choose the word green, anthropic replaced the concept it was thinking ahead about with green and saw that the line changed accordingly.