by thntk
701 days ago
Does anyone know what caused the very large performance jump from Large1 to Large2 in just a few months? Also, this looks like evidence of parameter redundancy: frontier models used to be 1.8T parameters, then 405B, and now 123B. If future frontier models come in under 10B, or even under 1B, that would be a game changer.