Hacker News
by HammadB 981 days ago
Yeah, was thinking the same thing. There's something deeper to explore here, especially given the consistency across domains. I wonder how far we can push "externalizing" the higher-level information the model wants to store during the forward pass.