| HN Mirror



	by alfonsodev 592 days ago
	Can you elaborate? Are those workflows queued, or can they serve multiple users in parallel? I think it's super interesting to learn about real-life workflows and the performance of different LLMs and hardware, so if you can, please direct me to other resources. Thanks!

1 comment

Our use case is atypical, based on what others seem to require. While we serve multiple requests in parallel, our workloads are not 'chat'.