Hacker News
by menaerus 324 days ago
OK, so this goes to show that your approach doesn't work without first applying fixes to the vanilla models. What I'm trying to understand is the approach itself: why does it work, and how?
1 comment
danielhanchen 324 days ago
Oh, I wrote a bit about it in https://docs.unsloth.ai/basics/unsloth-dynamic-2.0-ggufs and https://unsloth.ai/blog/deepseekr1-dynamic if that helps!