	by wongarsu 1192 days ago
	If you run it with 4-bit quantization completely on the CPU (similar to llama.cpp), ChatGPT should run in about 90 GB of RAM. That's easy to get your hands on for a desktop, but out of reach for notebooks. Also expect performance of a couple of seconds per token in that setup, so for now you need something involving GPUs.
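
	For anyone wanting to check the ~90 GB figure, here is a rough back-of-the-envelope sketch. It assumes a GPT-3-sized model of 175 billion parameters, which the comment doesn't state and which may not match what ChatGPT actually runs on. The function name and constant are made up for illustration, and the estimate covers weights only, ignoring the KV cache, activations, and any per-block quantization overhead.

```python
def quantized_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: 175e9 parameters (GPT-3 scale); not stated in the comment above.
ASSUMED_PARAMS = 175e9

weights_4bit_gb = quantized_weight_gb(ASSUMED_PARAMS, 4)
weights_fp16_gb = quantized_weight_gb(ASSUMED_PARAMS, 16)

print(f"4-bit: {weights_4bit_gb:.1f} GB")
print(f"fp16:  {weights_fp16_gb:.1f} GB")
```

	Under that assumption the 4-bit weights alone land a bit under 90 GB, which lines up with the estimate; real usage would be somewhat higher once runtime buffers are included.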