|
|
|
|
|
by seldo
656 days ago
|
|
For agentic use cases, where you might need several round-trips to the LLM to reflect on a query, improve a result, etc., getting fast inference means you can do more round-trips while still responding in reasonable time. So basically any LLM use-case is improved by having greater speed available IMO. |
|