Hacker News
by kolinko 12 days ago
Just read up on how transformers and attention work, and on the key/value/query (KVQ) mechanism in attention.
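For anyone who wants something concrete to go with that reading, here's a rough sketch of the standard scaled dot-product attention (softmax of q·k / sqrt(d_k), then a weighted sum of the values). It's plain Python with no libraries, and the function names `softmax` and `attention` are just my own picks for illustration, not from any particular codebase. It also skips the learned projection matrices, masking, and multiple heads.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # queries: list of vectors of length d_k
    # keys:    list of vectors of length d_k (one per value)
    # values:  list of vectors of length d_v
    d_k = len(keys[0])
    d_v = len(values[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        # Turn the scores into weights that sum to 1.
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(d_v)
        ])
    return out
```

With a single key the weight is 1, so the output is just that key's value. With two identical keys the output is the plain average of their two values.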