Extracting 100K concepts from an 8B LLM (guidelabs.ai)
2 points by adebayoj 108 days ago

Hey HN, we recently released Steerling-8B, an 8B-parameter model designed to be interpretable from the ground up. The model has ~100K concept slots that it fills on its own during training, and we can read off what each one means by projecting it into vocabulary space.
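For anyone unfamiliar with the idea, here is a toy sketch of what "projecting into vocabulary space" can look like: take a concept vector, dot it against each token's unembedding row, and read the highest-scoring tokens as a label. This is not Steerling-8B's code; the vocabulary, the matrix, the concept vector, and the `top_tokens` helper are all made up for illustration.

```python
# Hypothetical toy setup: none of these names or numbers come from Steerling-8B.
VOCAB = ["colour", "color", "favourite", "favorite", "the"]

# Toy unembedding matrix: one row (hidden size 3) per vocabulary token.
UNEMBED = [
    [1.0, 0.0, 0.2],   # colour
    [-1.0, 0.0, 0.2],  # color
    [0.9, 0.1, 0.0],   # favourite
    [-0.9, 0.1, 0.0],  # favorite
    [0.0, 1.0, 0.0],   # the
]


def top_tokens(concept_vec, unembed, vocab, k=2):
    """Score each token by its dot product with the concept vector; return the k best."""
    scores = [sum(w * c for w, c in zip(row, concept_vec)) for row in unembed]
    ranked = sorted(range(len(vocab)), key=lambda i: scores[i], reverse=True)
    return [vocab[i] for i in ranked[:k]]


# An invented concept direction that happens to separate British from American spellings.
british_concept = [1.0, 0.0, 0.0]
print(top_tokens(british_concept, UNEMBED, VOCAB))  # ['colour', 'favourite']
```

Flipping the sign of the toy vector surfaces the American spellings instead, which is the kind of contrast a spelling concept would be expected to show.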

The model figured out things like British vs. American spelling, second-person pronouns across 6+ languages, and even broken Unicode.

Take a look, and let us know what you think.