Hacker News new | ask | show | jobs
Debugging misaligned completions with sparse-autoencoder latent attribution (alignment.openai.com)
1 points by rd 206 days ago