by hansvm
286 days ago
Yeah, we do 100k ML inferences per second. It's not a single server, but the architecture isn't much more complicated than that. With today's computers, indexing the entire internet and serving 100k QPS also isn't really that demanding architecturally. The vast majority of current implementation complexity exists for reasons other than necessity.