Hacker News
by zahlman 542 days ago
The interesting thing to me is that the scratchpad operates at the level it does. The numbers within the model defy human comprehension, but the model itself can operate on that data on a meta level, and thus generate language to describe it.

I think it's spooky mainly because we, as humans, have extensively trained ourselves to associate text written in first person with human thought.

1 comment

Yeah, though that surprising effect is present in any LLM. You can give it text, and it gives you more text that's spookily coherent and related. Some of it is going to be in first person because lots of its training data was in first person. That's still neat! But...

The thing Anthropic really wants us to believe is that rather than just feeding the text it outputs back in, which is a rather banal framing of what they're doing, we've "given it a secret notepad". It's a narrative framing that I think obscures the REAL interesting stuff going on, but I guess LLMs are too boring now, so we need to create some pointless moral drama about it for press.