| HN Mirror


by ggcr 163 days ago

From the HuggingFace model card [1] they state:

> "In particular, Qwen3.5-Plus is the hosted version corresponding to Qwen3.5-397B-A17B with more production features, e.g., 1M context length by default, official built-in tools, and adaptive tool use."

Does anyone know more about this? The OSS version seems to have a 262144 context length; I guess for the 1M they'll ask you to use YaRN?

[1] https://huggingface.co/Qwen/Qwen3.5-397B-A17B

2 comments

NitpickLawyer 163 days ago

Yes, it's described in this section - https://huggingface.co/Qwen/Qwen3.5-397B-A17B#processing-ult...

YaRN, but with some caveats: current implementations might reduce performance on short contexts, so only enable YaRN for long-context tasks.
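For anyone who wants to sanity-check the numbers, here is a rough sketch of the arithmetic and of the "only enable it for long inputs" rule. The dict keys (`rope_type`, `factor`, `original_max_position_embeddings`) follow the rope-scaling convention used in Hugging Face configs for earlier Qwen releases; whether Qwen3.5 uses exactly these keys is an assumption, so check the model card section linked above. The helper names and the round-up rule for the factor are hypothetical choices, not something the model card prescribes.

```python
import math

NATIVE_CTX = 262_144    # native context length quoted in the thread
TARGET_CTX = 1_000_000  # "1M" target; the exact hosted limit is an assumption


def yarn_factor(target_ctx: int, native_ctx: int = NATIVE_CTX) -> float:
    """Smallest whole-number factor such that native_ctx * factor >= target_ctx."""
    return float(math.ceil(target_ctx / native_ctx))


def rope_scaling_for(prompt_tokens: int, native_ctx: int = NATIVE_CTX):
    """Return a YaRN-style override only when the input exceeds native context.

    Returning None for short inputs reflects the caveat that static YaRN
    may hurt short-context quality.
    """
    if prompt_tokens <= native_ctx:
        return None  # short context: leave RoPE untouched
    return {
        "rope_type": "yarn",  # assumed key name, see lead-in
        "factor": yarn_factor(prompt_tokens, native_ctx),
        "original_max_position_embeddings": native_ctx,
    }


if __name__ == "__main__":
    print(yarn_factor(TARGET_CTX))      # factor needed to cover ~1M tokens
    print(rope_scaling_for(100_000))    # short input: no override
    print(rope_scaling_for(500_000))    # long input: YaRN override dict
```

With a native window of 262144, a factor of 4 covers 1,048,576 tokens, which is roughly the 1M figure quoted for the hosted version.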

Interesting that they're serving both on OpenRouter, and the -plus is a bit cheaper for <256k ctx. So they must have more (proprietary) inference goodies packed in there.

We'll see where the third-party inference providers settle on cost.


ggcr 163 days ago

Thanks, I totally missed that.

It's basically the same as with the Qwen2.5 and Qwen3 series, but this time with 1M context and ~262k native, yay :)


danielhanchen 163 days ago

Unsure, but most likely they use YaRN, and maybe trained a bit more on long context (or not).
