Show HN: Multi-region Vertex AI inference router with Cloud Run (medium.com)
1 point by bernieongewe 169 days ago
1 comments

OP here. Seems there are a number of folks struggling with this. Happy to answer questions about the networking setup.
Why not use the global Vertex AI endpoint instead of balancing across regions?

1. Using the global endpoint is Google's own recommendation, despite the latency concerns.

2. Some models are only available in the global region.
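
For anyone comparing the two approaches, here is a rough sketch of how the request URL differs between the global endpoint and a regional one. It assumes the commonly documented Vertex AI REST pattern (a region-prefixed host for regional calls, the bare host with `locations/global` for the global endpoint). The project ID, model name, and region list are hypothetical placeholders, and the "global first, then regions" ordering is just one possible fallback policy, not what the linked article does.

```python
from typing import List


def vertex_endpoint(project: str, model: str, location: str = "global") -> str:
    """Build a generateContent URL for a Google-published model.

    Assumption: the global endpoint uses the bare host, while regional
    endpoints prefix the host with the region name.
    """
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )


def candidate_endpoints(project: str, model: str, regions: List[str]) -> List[str]:
    """Return the global endpoint first, then regional ones as fallbacks."""
    return [vertex_endpoint(project, model)] + [
        vertex_endpoint(project, model, region) for region in regions
    ]


if __name__ == "__main__":
    # Hypothetical project, model, and regions, for illustration only.
    for url in candidate_endpoints("my-project", "gemini-2.0-flash", ["us-central1", "europe-west4"]):
        print(url)
```

A router would only need the regional entries when a model is unavailable on the global endpoint or when data residency rules out global routing.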

Where/why are folks struggling with this?