by gabrielruttner
726 days ago
Folks are using us for long-lived tasks traditionally considered background jobs, as well as near-real-time background jobs. Our latency is acceptable for requests where users may still be waiting, such as LLM/GPU inference. Some concrete examples:

1. Repository/document ingestion and indexing fanout for applications like code generation or legal tech LLM agents

2. Orchestrating cloud deployment pipelines

3. Web scraping and post-processing

4. GPU inference jobs requiring multiple steps, compute classes, or batches