by spacebacon 8 days ago
On problems this close to active research, seeing the model’s internal reasoning at the points of highest effort is more valuable than pass/fail outcomes alone. SRT-Introspect makes that kind of inspection possible on frozen models.

https://github.com/space-bacon/SRT