by brucethemoose2 1031 days ago

Llama 34B is just small enough to fit on a 24GB consumer (or affordable server) GPU.

It's also just the right size for llama.cpp inference on machines with 32GB RAM, or 16GB RAM with an 8GB+ GPU.
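
For a rough sanity check on those numbers, here is a back-of-the-envelope sketch in Python. The bit widths are my assumption (no quantization level is stated above), `weight_gib` is just a made-up helper name, and it counts only the weights, ignoring KV cache, activations, and runtime overhead.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed precisions: fp16, 8-bit, and 4-bit quantization.
for bits in (16, 8, 4):
    print(f"34B at {bits}-bit: ~{weight_gib(34, bits):.1f} GiB")
```

Under these assumptions, only the 4-bit case leaves headroom on a 24GB card; the 16-bit weights alone are well over that.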

Basically it's the most desirable size for AI finetuning hobbyists, and the quality jump from llama v1 13B to llama v1 33B is huge.