by marcalc 407 days ago
What do you mean by draft model? And how would one disable it? Cheers

A draft model is something you have to explicitly enable, so it stays off unless you turned it on. It is a smaller model that speculatively generates the next few tokens, which the main model then verifies. In theory that speeds up generation.

Here are the LM Studio docs on it: https://lmstudio.ai/docs/app/advanced/speculative-decoding
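
If it helps to see the idea in code, here is a rough toy sketch of the greedy form of speculative decoding. This is not LM Studio's implementation: `target_next` and `draft_next` are made-up stand-ins for the big and small models, tokens are plain integers, and real systems verify the whole draft in one batched forward pass (and use a probabilistic accept rule when sampling).

```python
from typing import Callable, List

# A "model" here is just a function: token history -> next token (hypothetical).
NextToken = Callable[[List[int]], int]


def target_next(tokens: List[int]) -> int:
    """Toy stand-in for the large, slow model."""
    return (sum(tokens) * 31 + len(tokens)) % 50


def draft_next(tokens: List[int]) -> int:
    """Toy stand-in for the small, fast model: usually agrees, sometimes not."""
    guess = target_next(tokens)
    return (guess + 1) % 50 if len(tokens) % 4 == 0 else guess


def greedy_decode(model: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Plain decoding: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Draft proposes up to k tokens; target keeps the prefix it agrees with."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    while len(tokens) < goal:
        # 1. The cheap draft model guesses a short run of tokens.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))

        # 2. The target checks each guess. Called in a loop here for clarity;
        #    a real engine scores all positions in a single forward pass.
        for i, guessed in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if guessed != expected:
                # Keep the agreed prefix, then the target's own token.
                tokens.extend(proposal[:i])
                tokens.append(expected)
                break
        else:
            # Every guess matched, so the whole run is accepted.
            tokens.extend(proposal)
    return tokens


if __name__ == "__main__":
    prompt = [1, 2, 3]
    print(speculative_decode(target_next, draft_next, prompt, n_new=20))
```

The point of the sketch is that the output is the same as running the big model alone, since every kept token is one the target would have picked. The speedup only shows up when the draft guesses right often enough, which is why it is "in theory" faster.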