Hacker News
by jengels_ 752 days ago
It's a super interesting direction! That's one of the long-term goals of interp research: deconstruct model behavior into circuits of features, and then turn those circuits into code (that we can maybe even formally verify!).