by blackbear_ 1994 days ago
Any examples of novel insights obtained with this method?
2 comments

What I found most fascinating is identifying neuron firing patterns that correspond to linguistic properties, e.g. groups of neurons that fire in response to verbs or pronouns.

Scroll down to "Factorizing Activations of a Single Layer" in https://jalammar.github.io/explaining-transformers/ to see those.

The figure above it, titled 'Explorable: Ten Activation Factors of XML', shows neuron firing patterns in response to XML: opening tags, closing tags, and even indentation.
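
If you want a feel for the mechanics, here is a rough sketch of the idea. As far as I understand, the factorization in that section is non-negative matrix factorization (NMF) applied to a tokens-by-neurons activation matrix. Everything below is an assumption for illustration: the `nmf` helper is a hand-rolled multiplicative-update version rather than the article's code, and the "activations" are synthetic stand-ins, not values from a real model.

```python
import numpy as np

def nmf(V, n_factors, n_iter=500, seed=0, eps=1e-9):
    """Approximate non-negative V (tokens x neurons) as W @ H.

    W is tokens x factors (how strongly each token expresses a factor),
    H is factors x neurons (which neurons make up each factor).
    Uses Lee-Seung multiplicative updates for the Frobenius objective.
    """
    rng = np.random.default_rng(seed)
    n_tokens, n_neurons = V.shape
    W = rng.random((n_tokens, n_factors)) + eps
    H = rng.random((n_factors, n_neurons)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for one layer's activations (NOT real model data):
# verb tokens share one neuron pattern, pronoun tokens share another.
tokens = ["she", "runs", "he", "jumps", "they", "sing"]
is_verb = np.array([0, 1, 0, 1, 0, 1], dtype=float)
is_pronoun = 1.0 - is_verb
rng = np.random.default_rng(1)
verb_pattern = rng.random(8)      # hypothetical "verb" neuron group
pronoun_pattern = rng.random(8)   # hypothetical "pronoun" neuron group
activations = (np.outer(is_verb, verb_pattern)
               + np.outer(is_pronoun, pronoun_pattern))

W, H = nmf(activations, n_factors=2)

# Show how strongly each token loads on each learned factor.
for tok, weights in zip(tokens, W):
    print(tok, np.round(weights, 3))
```

With real activations you would take the (non-negative) outputs of one layer for every token in a passage, factor them the same way, and then color each token by its row of `W`. NMF solutions are not unique, so the factors need not line up one-to-one with a clean linguistic category.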

The method is still fresh, but I'm keen to see what other people uncover in their examinations (or what shortfalls or areas for improvement it has).

It's also mentioned in this video https://youtu.be/gJPMXgvnX4Y?t=429