by dingocat
497 days ago
What do you mean there is no such thing as R1-1.5b? DeepSeek released a distilled version based on a 1.5B Qwen model, with the full name DeepSeek-R1-Distill-Qwen-1.5B. See section 3.2 on page 14 of their research article [0].

[0] https://arxiv.org/abs/2501.12948