by UncleOxidant
76 days ago
Something in the 60B to 80B range would still be approachable for most people running local models and could also give improved results over the 31B. Also, as I understand it, the 26B is the MoE and the 31B is dense. Why is the larger one dense and the smaller one MoE?