| I've run R1 locally (the 600B one) and it does produce refusals similar to the ones in the article. In my limited testing I observed pretty much the same things the article describes. I used "What is the status of Taiwan?" and that fairly reliably triggered a canned answer. But when my prompt was literally just "Taiwan", I got a much less propaganda-like answer (the think part was still empty, though). I've also seen comments that in the app it sometimes starts giving an answer that then suddenly disappears, possibly because of moderation. My guess: the article author's observations are correct and apply to the local R1 too, but the app may have another layer of moderation on top. And yeah, it's really easy to bypass. I used the R1 quant from the unsloth people on Hugging Face and ran it on a 256GB server, with the default template the model carries in its metadata. If someone wants to replicate this, the filename of the first file is DeepSeek-R1-UD-Q2_K_XL-00001-of-00005.gguf (it's in five parts), and I got it from here: https://huggingface.co/unsloth/DeepSeek-R1-GGUF (Previously I thought quants at this level would be incredibly low quality, but this one seems to be somewhat coherent.) Edit: reading sibling comments, I somehow hadn't realized there is also something called "DeepSeek-R1-Zero", which maybe doesn't have the canned-response fine-tuning? From the Hugging Face page it seems DeepSeek-R1 is an "improvement" over Zero, but from a quick skim it's not clear whether Zero is a base model of some kind or just a different training technique. |
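For anyone replicating this, here is a rough sketch of how the five part names and the two prompts I compared could be listed. It assumes the other four parts follow the same `-0000N-of-00005` naming pattern as the first file, which I have not double-checked against the repo listing. `shard_names`, `FIRST` and `PROMPTS` are just names I made up for the example.

```python
import re


def shard_names(first_file: str) -> list[str]:
    """Expand a '-00001-of-0000N.gguf' filename into the names of all N parts.

    Assumption: every part uses the same prefix and zero-padded numbering
    as the first file.
    """
    m = re.fullmatch(r"(.+)-(\d{5})-of-(\d{5})\.gguf", first_file)
    if m is None:
        raise ValueError(f"not a split GGUF name: {first_file!r}")
    prefix, _, total = m.groups()
    return [f"{prefix}-{i:05d}-of-{total}.gguf" for i in range(1, int(total) + 1)]


# First part, exactly as given in the comment above.
FIRST = "DeepSeek-R1-UD-Q2_K_XL-00001-of-00005.gguf"

# The two prompts compared in the comment above.
PROMPTS = ["What is the status of Taiwan?", "Taiwan"]

if __name__ == "__main__":
    for name in shard_names(FIRST):
        print(name)
```

This only builds the filenames; downloading the parts and running the prompts is left to whatever runner you use.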