by Eisenstein 643 days ago
It isn't that cloud providers want to shut us out; it's that NVIDIA wants to relegate AI-capable cards to the high-end enterprise tier. So far in 2024 they have made $10.44B in revenue from the gaming market and over $47.5B from the datacenter market, and I would bet that there is much less profit in gaming. To keep the market segmented, they stopped putting NVLink on gaming cards and have capped VRAM at 24GB on the highest-end GPUs (3090 and 4090), and it doesn't look much better for the upcoming 5090. I don't blame them, they are a profit-maximizing corporation after all, but if anything is to be done about making large AI models practical for hobbyists, it has to start with NVIDIA.

That said, I really don't think the way forward for hobbyists is maxing out VRAM. Small models are becoming much more capable and accelerators are a possibility, so there may be no need to run a 70-billion-parameter model in memory at all when there are MoEs like Mixtral and small, capable models like Phi.
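
For a rough sense of why a 24GB card and a 70B model don't mix, here is a back-of-the-envelope sketch. It only counts the weights (parameters times bytes per parameter) and ignores KV cache, activations, and runtime overhead, so real usage would be higher. The function name and the choice of 16/8/4-bit precisions are mine, purely for illustration.

```python
# Rough estimate of the memory needed just to hold model weights.
# Assumption: weights dominate; KV cache and activations are ignored.

def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Return weight storage in GB, using 1 GB = 1e9 bytes."""
    return n_params * bits_per_param / 8 / 1e9

VRAM_GB = 24  # the 3090/4090 cap mentioned above

if __name__ == "__main__":
    for bits in (16, 8, 4):
        need = weight_memory_gb(70e9, bits)
        verdict = "fits" if need <= VRAM_GB else "does not fit"
        print(f"70B @ {bits}-bit: {need:.0f} GB -> {verdict} in {VRAM_GB} GB")
```

Even at 4 bits per parameter the weights alone come to about 35 GB, which is why smaller models look like the more realistic path on a single consumer card.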