HN Mirror



	by int_19h 1138 days ago
	It's not even that bad. Core i7-12700K with DDR5 gives me ~1 word per second on llama-30b - that is fast enough for real-time chat, with some patience. And things are even better on M1/M2 Macs.


Joeri 1138 days ago

The critical factor seems to be the ability to fit the whole model in RAM (--mlock option in oobabooga). With Apple's RAM prices, most M1/M2 owners probably don't have the 32 GB of RAM required to fit a 4-bit 30B model.
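
For a rough sense of the arithmetic behind that 32 GB figure, here is a back-of-the-envelope sketch. It assumes a nominal 30e9 parameters and exactly 4 bits per weight; real quantized files are somewhat larger because they also store per-block scale data. The `headroom_gb` value is a made-up allowance for the OS and inference buffers, not a measured number, and both function names are just illustrative.

```python
def weights_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_ram(ram_gb: float, weights_gb: float, headroom_gb: float = 4.0) -> bool:
    """True if the weights plus an assumed headroom fit in physical RAM.

    headroom_gb is a hypothetical allowance for the OS, context buffers, etc.
    """
    return weights_gb + headroom_gb <= ram_gb


# Nominal figures: 30e9 parameters at 4 bits each (assumption, see above).
weights_gb = weights_size_gb(30e9, 4)

for ram_gb in (16, 32, 64):
    print(f"{ram_gb} GB RAM: fits={fits_in_ram(ram_gb, weights_gb)}")
```

Under these assumptions the weights come to about 15 GB, which is too tight for a 16 GB machine once anything else is running, so 32 GB ends up being the practical minimum.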


Semaphor 1138 days ago

I have 64 GB RAM, but only a Ryzen 5 3600, and the larger models are very slow ;)
