by fatihturker 100 days ago
One question I'm interested in exploring:

If models become heavily compressed and streamed from SSD, where do people think the real bottleneck moves: storage bandwidth, memory bandwidth, or kernel efficiency?