HN Mirror



	by smallerize 473 days ago
	From https://huggingface.co/Qwen/QwQ-32B: "Presently, vLLM only supports static YARN, which means the scaling factor remains constant regardless of input length, potentially impacting performance on shorter texts. We advise adding the rope_scaling configuration only when processing long contexts is required."

1 comment

Sorry, could you please explain what this means? I'm not into machine learning, so I don't get the jargon.

Well, I can't be positive, but it looks like the scaling factor that supports a long context length is fixed, so it might be set wrong for shorter inputs. See https://blog.eleuther.ai/yarn/
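
In case it helps, here is a rough Python sketch of what "static" means in the quote and of the advice to add rope_scaling only when long contexts are needed. The numbers (a 32768-token native window and a factor of 4.0) and all function names are assumptions made up for illustration, not vLLM's API or the model's confirmed settings, so check the model card before copying anything.

```python
import copy

# Assumed values for illustration only; check the model card for the real ones.
NATIVE_CONTEXT = 32768  # assumed native context window, in tokens
YARN_FACTOR = 4.0       # assumed static scaling factor


def static_factor(input_len: int) -> float:
    """Static scaling: the same factor is used no matter how long the input is."""
    return YARN_FACTOR


def length_aware_factor(input_len: int) -> float:
    """Hypothetical length-aware variant: scale only once the input outgrows the native window."""
    return min(YARN_FACTOR, max(1.0, input_len / NATIVE_CONTEXT))


def with_rope_scaling(config: dict, max_input_len: int) -> dict:
    """Return a copy of config, adding rope_scaling only when long contexts are required."""
    new_config = copy.deepcopy(config)
    if max_input_len > NATIVE_CONTEXT:
        new_config["rope_scaling"] = {
            "type": "yarn",
            "factor": YARN_FACTOR,
            "original_max_position_embeddings": NATIVE_CONTEXT,
        }
    return new_config


if __name__ == "__main__":
    for n in (1_000, 65_536, 200_000):
        print(n, static_factor(n), length_aware_factor(n))
```

The point is that with static scaling a short prompt gets the same stretched position encoding as a very long one, which is why the quoted advice is to leave the setting out unless you actually need the long context.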