

by Lerc 751 days ago

I was very impressed with Anthropic's paper on concept mapping.

Post https://www.anthropic.com/news/mapping-mind-language-model

Paper https://transformer-circuits.pub/2024/scaling-monosemanticit...

This seems like a very good starting point for alignment. One could almost see a pathway from here to something like the laws of robotics. There's still a long way to go, but it's a good first step.