| HN Mirror


	by cootsnuck 300 days ago
	Super helpful to see actual examples of what it (roughly) can look like to deploy production inference workloads, and also the latest optimization efforts. I consult in this space and clients still don't fully understand how complex it can get to just "run your own LLM".