by ilaksh
481 days ago
There is a limit because model responses need to be nearly instant, and the smaller models that are generally capable of that speed come with trade-offs, unless you have unique hardware.
Only Cerebras can run medium to large models at truly near-instant speed.