by energy123
226 days ago
They perform similarly on benchmarks, but benchmark scores can be fudged to arbitrarily high numbers just by including the benchmark Q&A in the training data at a certain frequency, or by post-training on it. I have not been impressed with any of the DeepSeek models in real-world use.

Anecdata: our product is using a number of these models in production.
[0] https://openrouter.ai/rankings