by cunac 1939 days ago
Depends. If all the requests are made in parallel, they should hit different instances, which distributes load more efficiently than pinning everything to a single instance. You would actually get a response in 200ms, not to mention that it becomes easier to size each node properly. It also gives you a grey area where the response can be partial rather than just pass/fail. As usual, YMMV depending on the use case.
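
To make the parallel fan-out and partial-response idea concrete, here is a rough sketch in Python with asyncio. Everything in it is made up for illustration: `call_backend`, `fan_out`, the backend names and the 0.1s delays are assumptions, and the sleep stands in for a real network call that a load balancer would route to some instance.

```python
import asyncio
import time


async def call_backend(name: str, delay: float, fail: bool = False) -> str:
    """Hypothetical stand-in for one request to a backend instance."""
    await asyncio.sleep(delay)  # simulated network + processing latency
    if fail:
        raise RuntimeError(f"{name} unavailable")
    return f"{name}: ok"


async def fan_out(specs):
    """Issue all requests concurrently and keep whatever succeeded."""
    calls = [call_backend(name, delay, fail) for name, delay, fail in specs]
    # return_exceptions=True: a failed call becomes a value in the result
    # list instead of aborting the whole batch.
    results = await asyncio.gather(*calls, return_exceptions=True)

    data, failed = {}, []
    for (name, _, _), result in zip(specs, results):
        if isinstance(result, Exception):
            failed.append(name)
        else:
            data[name] = result
    # "partial" marks the grey area: some data came back, some did not.
    return {"data": data, "failed": failed, "partial": bool(failed)}


def demo():
    # Three made-up backends, 0.1s each; the third one is forced to fail.
    specs = [
        ("profile", 0.1, False),
        ("orders", 0.1, False),
        ("recommendations", 0.1, True),
    ]
    start = time.perf_counter()
    response = asyncio.run(fan_out(specs))
    elapsed = time.perf_counter() - start
    return response, elapsed


if __name__ == "__main__":
    response, elapsed = demo()
    print(response)
    print(f"elapsed: {elapsed:.2f}s")
```

Since the calls overlap, total latency is roughly that of the slowest call rather than the sum of all three, and the caller can decide whether a response flagged `partial` is good enough to return.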