Hacker News
by tsunamifury 750 days ago
You seem to entirely miss how attention layers work...