| HN Mirror



	by turmeric_root 1205 days ago
	More VRAM => larger models. IME it is absolutely worth maxing out VRAM for the significant improvement in quality, especially with LLaMA (though even a 4090, with its 24 GB of VRAM, can't run the largest 65-billion-parameter model, not even with 4-bit quantization). That said, I recommend renting a cloud GPU for a few hours and trying the larger models there before buying a GPU of your own, just to see whether the models meet your requirements.
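	For a rough sense of the arithmetic behind that claim, here is a back-of-the-envelope sketch. It counts weights only and assumes exactly 65e9 parameters and a flat bits-per-weight cost; real quantized files carry some extra overhead, and inference also needs room for activations and the KV cache, so treat the numbers as lower bounds. The function name and the budget labels are made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: "65-billion parameter" taken as exactly 65e9 parameters.
LLAMA_65B_PARAMS = 65e9

# Memory budgets in GB. The 4090 figure is its 24 GB of VRAM; the other two
# are the unified-memory sizes mentioned in the thread. Not all unified
# memory is necessarily usable by the GPU, so those are optimistic ceilings.
BUDGETS_GB = {
    "RTX 4090": 24,
    "96 GB unified memory": 96,
    "128 GB unified memory": 128,
}

for bits in (16, 8, 4):
    need = weight_memory_gb(LLAMA_65B_PARAMS, bits)
    print(f"{bits}-bit weights: ~{need:.1f} GB")
    for name, budget in BUDGETS_GB.items():
        verdict = "fits" if need <= budget else "does not fit"
        print(f"  {name}: {verdict} (weights only)")
```

	At 4 bits per weight the weights alone come to about 32.5 GB, which is already more than 24 GB before counting anything else.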

1 comment

But it should fit easily on an Apple MBP or Studio with 96GB or 128GB of unified memory.