by sdesol
260 days ago
Honestly, Gemini Flash Lite and the models on Cerebras are extremely fast. I know what you are saying. If the goal is to get a lot of results that may or may not be relevant, then yes, it is an order of magnitude slower. But if you take into consideration the post-analysis process, which is what inference is trying to solve, is it still an order of magnitude slower?