| HN Mirror


	by Taek 1214 days ago
	For reference, GPT-NeoX is a 20B-parameter model, and it runs in 45 GB of VRAM. At that ratio, an 80 GB A100 could probably run a 35B-parameter model. Maybe 8 A100 cards to do inference on ChatGPT? Or 32 3090 cards, which would run you under $40k total.
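
	A rough sketch of the arithmetic behind those numbers, assuming VRAM use scales linearly with parameter count at the 45 GB per 20B ratio quoted above. The 24 GB figure for a 3090 is the card's VRAM size; the function name and the linear-scaling rule are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope VRAM scaling from the GPT-NeoX data point.
# Assumption: memory grows linearly with parameter count.
GB_PER_BILLION_PARAMS = 45 / 20  # 2.25 GB per billion parameters

A100_VRAM_GB = 80
RTX_3090_VRAM_GB = 24


def max_params_billion(total_vram_gb: float) -> float:
    """Largest model (in billions of parameters) that fits in the given VRAM."""
    return total_vram_gb / GB_PER_BILLION_PARAMS


one_a100 = max_params_billion(A100_VRAM_GB)                  # about 35.6B
eight_a100 = max_params_billion(8 * A100_VRAM_GB)            # about 284B
thirty_two_3090 = max_params_billion(32 * RTX_3090_VRAM_GB)  # about 341B

# Implied price ceiling per 3090 if 32 cards cost under $40k.
price_per_3090 = 40_000 / 32  # 1250.0

print(f"1x A100:  ~{one_a100:.1f}B params")
print(f"8x A100:  ~{eight_a100:.0f}B params")
print(f"32x 3090: ~{thirty_two_3090:.0f}B params, under ${price_per_3090:.0f} per card")
```

	This ignores the overhead of splitting a model across cards, so the multi-card figures are upper bounds at best.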

1 comment

20B GPT-NeoX runs on a 3090 in 8-bit mode.
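
To see why 8-bit mode makes the difference, here is a minimal estimate of weight storage alone, assuming 1 GB = 10^9 bytes and a 24 GB 3090. It counts only the weights (no activations, cache, or framework overhead), so real usage will be higher; the helper name is hypothetical.

```python
RTX_3090_VRAM_GB = 24


def weight_gb(params_billion: float, bits_per_param: int) -> float:
    """GB needed to store the weights alone at the given precision."""
    return params_billion * bits_per_param / 8


fp16_gb = weight_gb(20, 16)  # 40.0 GB: too large for a single 3090
int8_gb = weight_gb(20, 8)   # 20.0 GB: fits, with about 4 GB to spare

print(f"fp16: {fp16_gb:.0f} GB, fits on a 3090: {fp16_gb <= RTX_3090_VRAM_GB}")
print(f"int8: {int8_gb:.0f} GB, fits on a 3090: {int8_gb <= RTX_3090_VRAM_GB}")
```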