by llm_trw
622 days ago
There isn't one. The real challenge for at-scale inference is that model compute runs too long to hold a normal API connection open, so you need a message-passing system in place. That system also needs to deliver large files for multi-modal models if it's not going to be obsolete in a year or two. I built a proof of concept using email, of all things, but could never get anyone to fund the real deal, which could run at larger than web scale.
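
To make the message-passing idea concrete, here is a minimal sketch of the submit/poll pattern, not the email-based proof of concept. Everything in it is an assumption for illustration: `JobBroker`, `fake_model`, and the `s3://example-bucket/...` reference are hypothetical names, and the in-memory queue stands in for whatever durable broker a real deployment would use. Large inputs are passed by reference rather than inline, which is one way to handle multi-modal payloads.

```python
import queue
import uuid


class JobBroker:
    """In-memory stand-in for a durable message broker (hypothetical)."""

    def __init__(self):
        self._pending = queue.Queue()  # jobs waiting for a worker
        self._records = {}             # job_id -> status/result record

    def submit(self, payload_ref: str) -> str:
        """Enqueue a job and return immediately with an ID to poll later."""
        job_id = uuid.uuid4().hex
        self._records[job_id] = {"status": "queued", "result": None}
        # Only a reference to the (possibly large) input travels in the message.
        self._pending.put((job_id, payload_ref))
        return job_id

    def poll(self, job_id: str) -> dict:
        """Return a copy of the job's current record; no connection is held open."""
        return dict(self._records[job_id])

    def work_once(self, run_model) -> bool:
        """Process one queued job if available; return False when the queue is empty."""
        try:
            job_id, payload_ref = self._pending.get_nowait()
        except queue.Empty:
            return False
        self._records[job_id] = {"status": "done", "result": run_model(payload_ref)}
        return True


def fake_model(payload_ref: str) -> str:
    """Placeholder for a slow inference call (assumption, not a real model)."""
    return f"processed {payload_ref}"
```

The client calls `submit`, disconnects, and checks back with `poll`, while a worker calls `work_once` on its own schedule, so the request lifetime no longer depends on how long inference takes.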
An example use with AWS Bedrock: https://temporal.io/blog/amazon-bedrock-with-temporal-rock-s...