Hacker News
by Havoc 1280 days ago
I'm thinking the next logical step, ChatGPT at the edge, could be even more useful.

Though I guess that still has the underlying limitation of the compute shortage, so it could take a while.

2 comments

There's a huge difference between diffusion models that were built to be run on commodity hardware and the huge autoregressive models like GPT. You can't even run GPT-3 on the cloud without some specialized interconnect.
How do you know this? Not doubting you, just curious. I've always been curious about the requirements or size of GPT-3, because EleutherAI's GPT-NeoX-20B takes like 40GB of VRAM to run and I think it is the closest analogue to GPT-3.
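For what it's worth, the 40GB figure lines up with simple arithmetic: parameter count times bytes per parameter. Here is a rough sketch, assuming 16-bit weights (2 bytes each) and counting the weights only, not activations or the KV cache. The function name `weight_memory_gb` is made up for illustration; the 175B parameter count for GPT-3 is the published figure.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the model weights alone, in GB (1 GB = 10^9 bytes).

    Assumes every parameter is stored at the same precision
    (2 bytes = fp16/bf16 by default). Ignores activations,
    KV cache, and framework overhead, so real usage is higher.
    """
    return n_params * bytes_per_param / 1e9


neox_20b_gb = weight_memory_gb(20e9)    # 20B parameters at fp16
gpt3_175b_gb = weight_memory_gb(175e9)  # 175B parameters at fp16

print(f"20B model:  ~{neox_20b_gb:.0f} GB of weights")
print(f"175B model: ~{gpt3_175b_gb:.0f} GB of weights")
```

By the same estimate a 175B model needs several times more memory than any single GPU offers, which is why it has to be split across many devices.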
Wait, you have to peer directly with their network or something?
No, you can't build a cluster of GPUs to run GPT without a special, very fast interconnect like InfiniBand. Stable Diffusion can run on a single GPU, like a 3090.
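To put a number on the single-GPU versus cluster point, here is a back-of-the-envelope sketch of the minimum GPU count needed just to hold the weights. It assumes fp16 weights, 24 GB of VRAM per card (the RTX 3090 figure), 175B parameters for GPT-3, and a very rough 1B parameters for Stable Diffusion (that last number is an approximation for illustration). `min_gpus_for_weights` is a hypothetical helper, and the result is a lower bound since activations and overhead are ignored.

```python
import math


def min_gpus_for_weights(n_params: float, gpu_vram_gb: float,
                         bytes_per_param: int = 2) -> int:
    """Lower bound on GPUs needed to hold the weights alone.

    Assumes the weights can be split evenly across devices and
    ignores activations, KV cache, and framework overhead.
    """
    weights_gb = n_params * bytes_per_param / 1e9
    return math.ceil(weights_gb / gpu_vram_gb)


RTX_3090_VRAM_GB = 24

# Rough parameter counts; the Stable Diffusion figure is an approximation.
print("GPT-3 (175B):", min_gpus_for_weights(175e9, RTX_3090_VRAM_GB))
print("Stable Diffusion (~1B):", min_gpus_for_weights(1e9, RTX_3090_VRAM_GB))
```

Once the weights span that many cards, every forward pass has to move data between them, which is where the interconnect speed starts to matter.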
OpenAI has very similar models available in their API.