Hacker News new | ask | show | jobs
by HDBaseT 31 days ago
There is plenty of other inference providers, but tell me, who is the cheapest?

Model - Deepseek V4 Pro

CHEAPEST PROVIDER: Provider: Deepseek Input Price - $0.435/M tokens Output Price - $0.87/M tokens Cache Read - $0.003625/M tokens

SECOND CHEAPEST: Provider: deepinfra Input Price - $1.30/M tokens Output Price - $2.60/M tokens Cache Read - $0.10/M tokens

Deepinfra is almost 3x more expensive and they are using a fp4 model, with Max 16.4K output (vs 364K) and have significantly lower throughput!