HN Mirror



	by curao_d_espanto 106 days ago
	> The emerging field of mechanistic interpretability suggests otherwise. Researchers are developing tools to understand how neural networks do what they do, from network ablation and selective activation to feature visualization and circuit tracing. These techniques let you study a trained model the way a biologist studies an organism, through careful experimentation and observation.

	Honestly, when I read that part of the article, I imagined that the author had never studied how computers were made or where the engineering ideas came from, as if all technology just "popped" into existence and here we are, talking about complexity and stuff as if the LLM were truly alive.

1 comment

The author is not wrong. You seem unaware of how nascent the field of LLM interpretability research is.

See this thread and article from earlier today showing what we're still able to learn from these interpretability experiments.