by seewhydee
3 days ago
That wouldn't explain why DeepSeek is fast relative to other Chinese providers, especially considering that they're reportedly ahead of the curve among Chinese companies in moving off Nvidia. I think their quant fund background has more to do with it. Their models are clearly designed with performant inference in mind.
|
https://x.com/ljupc0/status/2062457314414587996
Other local models I've checked drop to unusable speeds way sooner. The only other model I've tried with a similarly favourable curve is nemotron-cascade-2-30b-a3b. But it's a small model, way dumber than DS4F.
Coding agent use cases involve large context depths, so the rate of decline matters as much as the headline number.
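To put a number on "rate of decline", here's a rough sketch of how one might compare curves: take the generation speed at the shallowest and deepest context you measured and work out the average fraction of speed lost per doubling of context. The function name and the sample figures are made up for illustration, not measurements of any real model.

```python
import math

def decline_per_doubling(samples):
    """Average fractional speed loss per doubling of context depth.

    samples: list of (context_depth_tokens, tokens_per_second) pairs,
    sorted by depth. Only the first and last points are used, so this
    is a coarse summary of the curve, not a fit.
    """
    (d0, s0), (d1, s1) = samples[0], samples[-1]
    doublings = math.log2(d1 / d0)
    # Fraction of speed retained each time the context doubles.
    retained = (s1 / s0) ** (1 / doublings)
    return 1 - retained

# Hypothetical numbers, purely illustrative:
steep = [(1_000, 40.0), (8_000, 5.0)]    # high headline speed, collapses with depth
gentle = [(1_000, 30.0), (8_000, 20.0)]  # lower headline speed, holds up better

print(f"steep:  {decline_per_doubling(steep):.0%} lost per doubling")
print(f"gentle: {decline_per_doubling(gentle):.0%} lost per doubling")
```

With figures like these, the model with the lower headline number ends up faster at the depths a coding agent actually runs at.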