by alecco 27 days ago
PSA: Don't use OpenRouter for DeepSeek V4, as it messes up your caching. Use the DeepSeek API directly and you'll get 2x to 3x more cached tokens.
3 comments

Can you share more? I'm with OpenRouter and we would love to address this! We don't see this in our own testing, I don't believe -- but will share this feedback and dig in.
Just try. In a case last week it was ~3x and I tried multiple providers: deepseek, gmicloud/fp8, novita/fp8, and another one I can't remember. It was a large job where at least the first two-thirds of every prompt was exactly the same (literally a static string).

Then I read somewhere (I think X) that OpenRouter adds stuff and breaks caching (telemetry? headers? can't remember). So I stopped the job, switched to actual DeepSeek provider, and voilà, caching 3x more tokens per request (on average).

> switched to actual DeepSeek provider

I meant actual DeepSeek API.
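
For anyone who wants to check this on their own job instead of taking my word for it, here is a minimal sketch that computes the cached share of prompt tokens from the per-response usage data. It assumes each response carries `prompt_cache_hit_tokens` and `prompt_cache_miss_tokens` counters in its usage object, which is how I understand the DeepSeek API reports caching; verify the field names against your provider's docs, since a router may expose them differently or not at all. The `cache_hit_rate` helper and the sample numbers are hypothetical.

```python
def cache_hit_rate(usages):
    """Return the fraction of prompt tokens served from cache.

    `usages` is a list of dicts, one per response. The two key names
    are an assumption about the provider's usage format.
    """
    hit = sum(u.get("prompt_cache_hit_tokens", 0) for u in usages)
    miss = sum(u.get("prompt_cache_miss_tokens", 0) for u in usages)
    total = hit + miss
    # Avoid dividing by zero when no usage data was collected.
    return hit / total if total else 0.0


# Made-up numbers, for illustration only (not measurements).
direct = [{"prompt_cache_hit_tokens": 900, "prompt_cache_miss_tokens": 100}]
routed = [{"prompt_cache_hit_tokens": 300, "prompt_cache_miss_tokens": 700}]

print(f"direct: {cache_hit_rate(direct):.0%}")
print(f"routed: {cache_hit_rate(routed):.0%}")
```

Run the same batch of prompts through both routes and compare the two rates. With a long static prefix, the direct rate should sit close to the prefix's share of the prompt once the cache is warm.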

Here is some data from my experience using DeepSeek V4 Flash both directly and via OpenRouter.

Directly: 135M input tokens - $0.57 (134M cached)

Via OpenRouter: 6M tokens - $0.81 (caching stats & inp/out not reported)

Caching is a huge win when using DeepSeek directly.
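
To put those two figures on the same scale, here is a rough per-million-token calculation using only the numbers quoted above. It is a sketch, not a clean benchmark: the direct figure counts input tokens only, while the OpenRouter figure is an unspecified mix of input and output, so the comparison is approximate. The `usd_per_million` helper is just a name chosen for illustration.

```python
def usd_per_million(cost_usd, tokens_millions):
    """Blended cost in USD per one million tokens."""
    return cost_usd / tokens_millions


# Figures quoted in the comment above.
direct = usd_per_million(0.57, 135)   # 135M input tokens, $0.57
routed = usd_per_million(0.81, 6)     # 6M tokens, $0.81
cached_share = 134 / 135              # 134M of 135M input tokens cached

print(f"direct: ${direct:.4f}/M, via OpenRouter: ${routed:.4f}/M")
print(f"ratio: {routed / direct:.0f}x, direct cached share: {cached_share:.1%}")
```

On these numbers the blended price per token comes out roughly 32x higher via OpenRouter, with about 99% of the direct input tokens served from cache.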

I am experiencing this using Opencode. Caching works fine via the DeepSeek API but not so well via OpenRouter.
When you say DeepSeek API, do you mean servers in China? Or is it a copy of the model operated and run by OpenRouter?
Yes, I definitely noticed a problem with OpenRouter and DeepSeek V4 Pro. It's much more expensive.