Hacker News
by airspresso 242 days ago
Giving it low memory bandwidth was definitely a deliberate choice, probably to keep customers from thinking it can replace a data center GPU for inference use cases.