Hacker News
Major upgrades to Ray Serve: 88% lower latency and 11.1x higher throughput (anyscale.com)
2 points by robertnishihara 81 days ago
1 comment

The real story here is replacing Ray Core's actor task dispatch with direct gRPC on the hot path. That essentially admits the general-purpose actor model was the bottleneck for inference workloads. Smart move: let Ray handle orchestration and get out of the way of the actual data flow.
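
To make the orchestration vs. data-flow split concrete, here's a toy sketch in plain Python. It is not Ray Serve's code and uses no Ray or gRPC APIs; `Replica`, `Orchestrator`, and `DirectClient` are made-up names, and the direct method call just stands in for whatever direct transport the hot path uses. The only point is that the client asks the control plane where the replicas are once, then sends every request straight to a replica.

```python
# Toy illustration only: none of these classes are Ray or gRPC APIs.
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for a model replica that answers requests."""
    name: str
    handled: int = 0

    def infer(self, payload: str) -> str:
        self.handled += 1
        return f"{self.name}:{payload.upper()}"


@dataclass
class Orchestrator:
    """Control plane: knows which replicas exist, but serves no requests."""
    replicas: list[Replica] = field(default_factory=list)
    lookups: int = 0

    def resolve(self) -> list[Replica]:
        self.lookups += 1
        return list(self.replicas)


class DirectClient:
    """Data plane: resolves replicas once, then calls them directly."""

    def __init__(self, orchestrator: Orchestrator) -> None:
        self._replicas = orchestrator.resolve()  # single control-plane call
        self._next = 0

    def request(self, payload: str) -> str:
        # Round-robin over the cached replica list; the orchestrator is
        # never consulted on this per-request path.
        replica = self._replicas[self._next % len(self._replicas)]
        self._next += 1
        return replica.infer(payload)


orchestrator = Orchestrator([Replica("r0"), Replica("r1")])
client = DirectClient(orchestrator)
results = [client.request(p) for p in ["a", "b", "c"]]
print(results, "orchestrator lookups:", orchestrator.lookups)
```

Three requests go out, but the orchestrator is only consulted once. A real system would also need to refresh the cached replica list when replicas come and go, which this sketch leaves out.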