If you're going to rent a few EC2 GPU instances, you might as well funnel things through OpenRouter. Not that many of us have workflows where trusting an LLM provider is a problem but sending the data to EC2 is not.
As for why, why would you not? Sitting around waiting for a single assistant is an inefficient use of time; I tend to have more like 4-10 instances running in parallel.
> Not that many of us have workflows where trusting an LLM provider is a problem but sending the data to EC2 is not.
I'd imagine plenty of people have a problem trusting fly-by-night inference providers, or model owners with opt-out policies [1] [2] on training on your data, yet would be more than happy to send data to EC2, or even to the same models in Amazon Bedrock.