by itworkslikethat 1278 days ago
I tried them all, and continue trying when new things come out. No open model is capable of even remotely approaching the capabilities of GPT-3 or ChatGPT. Even if you had an equivalent trained model, running it would require enormous hardware.

While Stable Diffusion is pretty comparable to DALL-E 2 even on small hardware (such as an M1 MacBook Air), this doesn't hold true for large language models. These need much, much more VRAM.

2 comments

That's a real shame. Any idea why GPT-3's generations differ so much? Maybe EleutherAI or whoever could look at refining their model in a similar way.
It's most likely because of the actual runtime size of the model. All the open models are sized for consumer-grade devices, and are thus 10x-100x smaller than whatever OpenAI runs (probably around 100+ GB in VRAM, maybe even some multiples more). This is one of the main reasons why their API makes business sense - it's not practical at all to run models like GPT-3 yourself, and training them costs an incredible amount of money too.
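For a rough sense of scale, here is a back-of-the-envelope sketch of the memory needed just to hold the weights. It assumes the published 175B parameter count for GPT-3 and uses a 6B-parameter open model (GPT-J size) for comparison. What OpenAI actually serves isn't public, and the `weights_gb` helper is made up for illustration. Activations and caches are ignored, so real usage would be higher.

```python
def weights_gb(n_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Assumed sizes: 175B is the published GPT-3 figure; 6B stands in for a
# typical open model of the time (GPT-J size).
MODELS = {"GPT-3 (175B)": 175e9, "open model (6B)": 6e9}

# Bytes per parameter for common numeric precisions.
PRECISIONS = {"fp32": 4, "fp16": 2, "int8": 1}

for name, n_params in MODELS.items():
    for label, nbytes in PRECISIONS.items():
        print(f"{name:>16} @ {label}: {weights_gb(n_params, nbytes):7.1f} GB")
```

Even at half precision the 175B case is in the hundreds of GB, while the 6B case is around 12 GB, which lines up with the "100+ GB, maybe some multiples more" guess.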
text-davinci-003 is extremely close, or code-davinci-002
None of those models are available to run on your own hardware.