by coldtea
44 days ago
>This could be right for the current architecture of LLMs, but you can come up with specialized large language models that can more efficiently use tokens for a specific subset of problems by encoding the information differently.

That's precisely what happens on the bad side of an S-curve.