| HN Mirror

	by alecco 27 days ago
	PSA: Don't use OpenRouter for DeepSeek V4, as it messes up your caching. Use the DeepSeek API directly and you'll get 2x to 3x more cached tokens.

3 comments

numlocked 27 days ago

Can you share more? I'm with OpenRouter and we would love to address this! We don't see this in our own testing, I don't believe -- but will share this feedback and dig in.

alecco 26 days ago

Just try. In a case last week it was ~3x and I tried multiple providers: deepseek, gmicloud/fp8, novita/fp8, and another one I can't remember. It was a large job where at least the first 2/3 of each prompt was exactly the same (literally a static string).

Then I read somewhere (I think X) that OpenRouter adds stuff and breaks caching (telemetry? headers? can't remember). So I stopped the job, switched to actual DeepSeek provider, and voilà, caching 3x more tokens per request (on average).
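
For anyone who wants to check this on their own job, here is a minimal sketch of how the cached share of a prompt could be computed from the `usage` object of a chat completion response. It assumes the DeepSeek API reports `prompt_cache_hit_tokens` and `prompt_cache_miss_tokens`, and that an OpenAI-style response (as an aggregator might return) reports `prompt_tokens_details.cached_tokens`. The helper name `cached_fraction` and the sample numbers are hypothetical, not taken from the thread.

```python
def cached_fraction(usage: dict) -> float:
    """Return the fraction of prompt tokens served from cache (0.0 to 1.0).

    Assumption: `usage` is the usage dict of a chat completion response.
    Two shapes are handled; field names should be checked against the
    provider's current documentation.
    """
    if "prompt_cache_hit_tokens" in usage:
        # DeepSeek-style: cache hits and misses are reported separately.
        hit = usage["prompt_cache_hit_tokens"]
        total = hit + usage.get("prompt_cache_miss_tokens", 0)
    else:
        # OpenAI-style: cached tokens are nested under prompt_tokens_details.
        details = usage.get("prompt_tokens_details") or {}
        hit = details.get("cached_tokens", 0)
        total = usage.get("prompt_tokens", 0)
    return hit / total if total else 0.0


# Hypothetical usage payloads, for illustration only.
direct_usage = {"prompt_cache_hit_tokens": 200, "prompt_cache_miss_tokens": 100}
routed_usage = {"prompt_tokens": 300, "prompt_tokens_details": {"cached_tokens": 75}}

print(f"direct: {cached_fraction(direct_usage):.0%} cached")
print(f"routed: {cached_fraction(routed_usage):.0%} cached")
```

Logging this per request on the same prompts through both routes would show whether the static prefix is actually being reused.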

alecco 26 days ago

> switched to actual DeepSeek provider

I meant actual DeepSeek API.

bwfan123 26 days ago

Here is some data from my experience using both deepseek v4 flash directly, and deepseek v4 flash via openrouter.

Directly: 135M input tokens - $0.57 (134M cached)

Via OpenRouter: 6M tokens - $0.81 (caching stats & inp/out not reported)

Caching is a huge win when using deepseek directly.
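
To put the two figures above on the same scale, a rough sketch of the blended cost per million tokens follows. It uses only the totals quoted in this comment; the OpenRouter total does not break out input, output, or cached tokens, so this is not a like-for-like comparison, only an order-of-magnitude check. The function name `cost_per_million` is made up for the example.

```python
def cost_per_million(total_usd: float, tokens_millions: float) -> float:
    """Blended cost in USD per one million tokens."""
    return total_usd / tokens_millions


# Figures quoted in the comment above.
direct = cost_per_million(0.57, 135)        # DeepSeek API directly (input tokens only)
via_openrouter = cost_per_million(0.81, 6)  # via OpenRouter (token mix not reported)
ratio = via_openrouter / direct

print(f"direct:         ${direct:.4f} per 1M tokens")
print(f"via OpenRouter: ${via_openrouter:.4f} per 1M tokens")
print(f"ratio:          {ratio:.0f}x")
```

That works out to roughly $0.004 versus $0.135 per million tokens, a gap of about 32x, which is consistent with almost all of the direct input (134M of 135M) being billed at the cached rate.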

phainopepla2 26 days ago

I am experiencing this using OpenCode. Caching works fine via the DeepSeek API but not as well via OpenRouter.

SV_BubbleTime 27 days ago

When you say Deepseek API, you mean servers in China? Or is it a copy of the model operated and run by OpenRouter?

jaggs 26 days ago

Yes, I definitely noticed a problem with openrouter and deepseek v4 pro. It's much more expensive.
