by minxomat 1190 days ago
The full use case includes quantisation, which the repo points out uses a large amount of system RAM. Of course that’s not required if you skip that step.
2 comments

Judging from downloads of the 4-bit file and how few people I've seen post about quantizing it themselves, around 99% of people are just downloading the pre-quantized files.

I would not personally call compilation of software part of its "use case." Its use case is text generation.

Quantisation is a one-off process. I suspect most people who don't have access to a machine with enough RAM and don't want to use the pre-quantized version can afford the $20 to hire a big cloud server for a day.

Or it is probably possible to make it work slowly using a swapfile on Linux.
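
For a rough sense of the RAM gap being discussed, here is a back-of-the-envelope sketch. The parameter count is a made-up placeholder rather than a figure from the repo, and the estimate counts raw weights only, ignoring quantisation scales, activations and any working copies the conversion script may hold.

```python
def weight_bytes(n_params: int, bits_per_weight: int) -> int:
    """Bytes needed to hold n_params raw weights at the given bit width."""
    return n_params * bits_per_weight // 8


# Hypothetical model size, chosen only for illustration (not from the thread).
N_PARAMS = 7_000_000_000

if __name__ == "__main__":
    for bits in (16, 4):
        gib = weight_bytes(N_PARAMS, bits) / 2**30
        print(f"{bits}-bit weights: {gib:.1f} GiB")
```

The quantising step has to read the full-precision weights, which is presumably why it needs far more memory than running the 4-bit result does. A 16-bit copy is four times the size of the 4-bit one.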