	by awuji 336 days ago
	You can already run a large LLM (something in the class of Sonnet 3.5) locally on CPU with 128GB of RAM, which costs under 300 USD, and any shortfall in RAM can be offset with swap space. Obviously, responses are going to be slower, but I can't imagine people will pay much more than 20 USD just to avoid waiting 30-60 seconds longer for a response. And consumer hardware is already being optimized more and more for running models locally.
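
	As a rough back-of-the-envelope check on the RAM claim, here is a minimal sketch that estimates how much memory a model's weights take at a given precision. The 70B parameter count, the 1.2x overhead factor, and the function names are all assumptions for illustration; the size of Sonnet 3.5 is not public, so this says nothing about that model specifically.

```python
def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def fits_in_ram(n_params: float, bits_per_weight: float,
                ram_gib: float = 128.0, overhead: float = 1.2) -> bool:
    """True if weights times a rough overhead factor fit in RAM without swap.

    The overhead factor (KV cache, runtime buffers) is a guess, not a measurement.
    """
    return weights_gib(n_params, bits_per_weight) * overhead <= ram_gib


if __name__ == "__main__":
    # Hypothetical 70B-parameter model at common precisions.
    for bits in (16, 8, 4):
        size = weights_gib(70e9, bits)
        ok = fits_in_ram(70e9, bits)
        print(f"70B @ {bits}-bit: {size:.1f} GiB, fits in 128 GiB: {ok}")
```

	Anything that does not fit would spill into swap, which is where the extra waiting time comes from.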