by cbuskilla
2230 days ago
Sure!
It is the 90M-param model, and they trained models up to almost 10B params, so I guess it gets better with size (I didn't try the bigger ones, way too expensive). And I agree about the ALICE derivatives: Mitsuku is nice without doing anything fancy.