HN Mirror


	by hansvm 337 days ago
	Yeah, we do 100k ML inferences per second. It's not a single server, but the architecture isn't much more complicated than that. With today's computers, indexing the entire internet and serving 100k QPS also isn't really that demanding architecturally. The vast majority of current implementation complexity exists for reasons other than necessity.