Hacker News
by Lerc 751 days ago
I was very impressed with Anthropic's paper on concept mapping.

Post https://www.anthropic.com/news/mapping-mind-language-model

Paper https://transformer-circuits.pub/2024/scaling-monosemanticit...

This seems like a very good starting point for alignment. One could almost see a pathway from here to something like the laws of robotics. There's a long way to go, but it's a good first step.