by zaptrem, 921 days ago
IIRC Ethereum ASICs were also memory-bandwidth bound. With KV caching, transformer inference is just lots and lots of matrix-vector multiplications, and it's bound by loading the huge weight matrices onto the cores.
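
To put a rough number on that, here is a minimal back-of-the-envelope sketch of the arithmetic intensity of a single matrix-vector multiply. The function names, the 4096x4096 layer size, the 2-byte weights, and the hardware peak figures are all assumptions picked for illustration, not measurements of any real chip. It counts one multiply and one add per weight and assumes every weight is read from memory exactly once.

```python
def matvec_arithmetic_intensity(d_out: int, d_in: int, bytes_per_weight: int = 2) -> float:
    """FLOPs per byte moved for y = W @ x, with W of shape (d_out, d_in)."""
    flops = 2 * d_out * d_in  # one multiply and one add per weight
    # Bytes for W, x, and y, assuming each is touched once at the same precision.
    bytes_moved = bytes_per_weight * (d_out * d_in + d_in + d_out)
    return flops / bytes_moved


def is_bandwidth_bound(intensity: float, peak_flops: float, peak_bytes_per_s: float) -> bool:
    """Roofline-style check: bandwidth bound when intensity is below the machine balance."""
    return intensity < peak_flops / peak_bytes_per_s


# Hypothetical layer size and hardware numbers, chosen only for illustration.
intensity = matvec_arithmetic_intensity(d_out=4096, d_in=4096)
bound = is_bandwidth_bound(intensity, peak_flops=100e12, peak_bytes_per_s=1e12)
print(f"{intensity:.3f} FLOPs/byte, bandwidth-bound: {bound}")
```

With 2-byte weights the intensity comes out just under 1 FLOP per byte, so any accelerator that can do many more FLOPs per second than it can load bytes per second will sit waiting on the weight loads. Batching several sequences together turns the matrix-vector product into a matrix-matrix product and reuses each loaded weight, which is the usual way to raise that ratio.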