Hacker News
by NetRunnerSu 335 days ago
On the other hand, we can also diagnose the LLM itself: its activations are its EEG and its gradients are its BOLD signal. If you are willing to pay the compute cost, you can even calculate its true variational free energy, that is, the KL divergence.
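
To make that concrete, here is a rough toy sketch of the three readouts for a single softmax output layer: the activations (output probabilities), the gradient of the cross-entropy loss with respect to the logits, and the KL divergence from a target distribution to the model's prediction. The function names (`softmax`, `kl_divergence`, `diagnose`) are made up for illustration and are not taken from the linked repository; a real LLM would need framework hooks on its internal layers rather than hand-written math.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) in nats; terms where p_i == 0 contribute nothing.
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)

def diagnose(logits, target):
    # Hypothetical helper: one "scan" of a single output layer.
    # `target` is assumed to be a probability distribution (sums to 1).
    probs = softmax(logits)                        # activation readout
    grad = [p - t for p, t in zip(probs, target)]  # d(cross-entropy)/d(logits)
    grad_norm = math.sqrt(sum(g * g for g in grad))
    return {
        "probs": probs,
        "grad_norm": grad_norm,
        "kl": kl_divergence(target, probs),
    }

if __name__ == "__main__":
    # The model is confident in class 0, but the target is class 1.
    report = diagnose([2.0, 0.0], [0.0, 1.0])
    print(report)
```

With a one-hot target the KL term reduces to the negative log-probability of the correct class, so a confident wrong answer shows up as both a large gradient norm and a large KL value.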

"Don't just train your model, understand its mind."

https://github.com/dmf-archive/