by Gloomily3819 726 days ago
What a misleading article. I thought they'd made some breakthrough in resource efficiency. This is just the old, slow method that tools like Ollama have used.
2 comments

Do you know how much disk space this takes total? When I ran it, it downloaded nearly 30 gigabytes of models and seemed to be on track to download 28 more 5 gigabyte chunks (for a total of 150 gigabytes of disk space or maybe more). What is the total size before it finishes?
70B parameters * 2 bytes each (fp16 or bf16) = 140GB
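To spell that arithmetic out, here's a rough sketch that computes raw weight size in bytes for a few weight formats. It assumes exactly 70 billion parameters and counts only the weights themselves. File metadata and any extra or duplicate files in a download are ignored, so treat the numbers as a lower bound, not the actual download size. The `weight_bytes` helper and the `BITS_PER_PARAM` table are made up for illustration.

```python
# Bits per parameter for a few common weight formats (illustrative table).
BITS_PER_PARAM = {"fp32": 32, "fp16": 16, "bf16": 16, "int8": 8, "int4": 4}


def weight_bytes(num_params: int, dtype: str) -> int:
    """Raw weight size in bytes; ignores metadata and any extra files."""
    return num_params * BITS_PER_PARAM[dtype] // 8


if __name__ == "__main__":
    params = 70_000_000_000  # assumed: exactly 70B parameters
    for dtype in BITS_PER_PARAM:
        size = weight_bytes(params, dtype)
        # Show both decimal GB and binary GiB, since tools disagree on which they report.
        print(f"{dtype}: {size:,} bytes ({size / 1e9:.0f} GB, {size / 2**30:.1f} GiB)")
```

At 2 bytes per parameter this gives 140,000,000,000 bytes, matching the 140GB figure above.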

I wish model sizes were published in bytes.

Thanks, I finished downloading it (which took many hours) onto an external hard drive (by setting the HF_HOME environment variable to control where the cache is stored). Its size was 262 GB.
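For anyone wanting to do the same, here's a minimal sketch of redirecting the cache with HF_HOME and then measuring what ended up on disk. The path `/mnt/external/hf_cache` is a placeholder, and `dir_size_bytes` is a helper written for this example, not part of any Hugging Face library. As far as I know the variable has to be set before the Hugging Face libraries are imported, because they read it at import time.

```python
import os

# Placeholder path (assumption): point this at your own external drive.
# Set it before importing any Hugging Face library.
os.environ["HF_HOME"] = "/mnt/external/hf_cache"


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of regular files under root, skipping symlinks.

    Skipping symlinks avoids counting a file twice when a cache stores
    the data once and links to it from another directory.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    size = dir_size_bytes(os.environ["HF_HOME"])
    print(f"{size:,} bytes ({size / 1e9:.1f} GB)")
```

Running this after the download finishes gives the on-disk total in bytes, which is the number to compare against the parameter-count estimate.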
What method is that? Layer offloading?
Yes, it's either that, or CPU inference. The article doesn't say.

It doesn't mention quantization either.