by suninsight
403 days ago
That matches what we have found. <thinking> models do a lot better, but with a huge drop in speed. For now, we have chosen accuracy over speed. But the slowdown is around 3-4x, so we might move to an architecture where we 'think' only sporadically. Everything happening in the LLM space is so close to how humans naturally think.