
by pulse7 589 days ago

Options:

A) 128GB RAM with the fastest Intel/AMD CPU, no GPU: you can run big/good models, but very slowly (about 0.5 to 3 tokens/second)

B) Fastest Mac with 128GB/192GB: you can run big/good models at moderate speed (around 5-10 tokens/second)

C) 16/32GB RAM + RTX 4090 with 24GB VRAM: you can run smaller (but still good) models very fast, entirely in VRAM (20-30 tokens/second)
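
To make the "big models vs. smaller models" split concrete, here is a rough back-of-the-envelope sketch. It is not from the comment above. The 4-bit quantization, the 8B and 70B model sizes, and the 20% overhead allowance for context and runtime buffers are all assumptions picked for illustration, and real memory use will vary by runtime and context length.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         memory_gb: float, overhead: float = 1.2) -> bool:
    """True if weights plus a rough overhead allowance fit in memory_gb.

    overhead=1.2 is a hypothetical 20% margin, not a measured figure.
    """
    return weights_gb(params_billion, bits_per_weight) * overhead <= memory_gb


# Memory budgets taken from options A, B and C in the comment.
OPTIONS = {
    "A) 128GB RAM, CPU only": 128,
    "B) Mac with 128GB": 128,
    "C) RTX 4090, 24GB VRAM": 24,
}

if __name__ == "__main__":
    for name, memory_gb in OPTIONS.items():
        for params in (8, 70):  # assumed example model sizes, in billions
            ok = fits(params, bits_per_weight=4, memory_gb=memory_gb)
            size = weights_gb(params, 4)
            print(f"{name}: {params}B @ 4-bit (~{size:.0f} GB) fits: {ok}")
```

Under these assumptions a 70B model at 4 bits needs about 35 GB for weights alone, so it fits in 128GB of RAM or unified memory but not in 24GB of VRAM, while an 8B model fits everywhere. The sketch only answers "does it fit"; it says nothing about the tokens/second figures quoted above.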