by whimsicalism 478 days ago
Sure, on consumer GPUs, but that is not what constrains model inference in most actual industry setups. Technically, even then you are bound by CPU-GPU memory bandwidth more than by GPU memory alone, although that is maybe splitting hairs.
1 comment

Why are industry setups considered actual while others are not?