Hacker News
by qcnguy 254 days ago
Anthropic have done some great work on neural interpretability that gets at the core of this problem.