|
|
|
|
|
by LASR
581 days ago
|
|
What you can do with current-gen models, along with RAG, multi-agent & code interpreters, the wall is very much model latency, and not accuracy any more. There are so many interactive experiences that could be made possible at this level of token throughput from 405B class models. |
|
Or is the rulebook a simple rollback?