| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by moffkalast 332 days ago
	Yeah it's only really viable for chat use cases, coding is the most demanding in terms of generation speed, to keep the workflow usable it needs to spit out corrections in seconds, not minutes. I use local LLMs as much as possible myself, but coding is the only use case where I still entirely defer to Claude, GPT, etc. because you need both max speed and bleeding edge model intelligence for anything close to acceptable results. When Qwen-3-Coder lands + having it on runpod might be a low end viable alternative, but likely still a major waste of time when you actually need to get something done properly.