Hacker News
by alfonsodev 592 days ago
Can you elaborate? Are those workflows queued, or can they serve multiple users in parallel?

I think it’s super interesting to learn about real-life workflows and the performance of different LLMs and hardware, so if you can direct me to other resources, I'd appreciate it. Thanks!

1 comment

Our use case is atypical, based on what others seem to require. While we serve multiple requests in parallel, our workloads are not 'chat'.