Hacker News
by 88stacks 1229 days ago
This is wonderful, no doubt about it, but the bigger problem is making this usable on commodity hardware. Stable Diffusion only needs 4 GB of RAM to run inference, but all of these large language models are too large to run on commodity hardware. BLOOM from Hugging Face is already out and no one is able to use it. If ChatGPT were given to the open source community, we couldn’t even run it…
3 comments

> BLOOM from Hugging Face is already out and no one is able to use it.

The RLHF dataset that Open Assistant is collecting is just the kind of data that will turn a rebel LLM into a helpful assistant. But the model itself is still huge and expensive to run.

Some people will have the necessary hardware, others will be able to run it in the cloud.

I'm curious myself how they will get these LLMs to work on consumer hardware. Is FP8 the way to make them small?
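
For a rough sense of what lower precision buys, here is a back-of-the-envelope sketch in Python. It assumes weight memory is simply parameter count times bytes per parameter, and it ignores activations, the KV cache, and framework overhead, so real usage would be higher. The function name and the dtype table are my own, not from any library; the 176B figure is the size of the largest BLOOM checkpoint.

```python
# Weight-only memory estimate: parameters x bytes per parameter.
# Assumption: ignores activations, KV cache, and runtime overhead.

# Hypothetical lookup table of storage cost per parameter, in bytes.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Approximate weight storage in GB (10^9 bytes) for a given precision."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    bloom_params = 176e9  # largest BLOOM checkpoint, about 176B parameters
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: ~{weight_memory_gb(bloom_params, dtype):.0f} GB")
```

By this estimate FP8 halves the footprint relative to FP16, but a 176B-parameter model would still need on the order of 176 GB for the weights alone, so lower precision by itself probably doesn't get a model of that size onto a typical consumer GPU.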

And there's a 99% chance it will only work on NVIDIA hardware, so even fewer people will be able to run it.