by chrismustcode 3 days ago
You don't consider Input $0.435 Output $0.87 cache read $0.003625 per million tokens for near frontier intelligence cheap?
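To put the quoted rates in concrete terms, here is a rough sketch that prices a single request. The three per-million-token prices are the ones from the comment above; the function name and the token counts in the example are made up for illustration.

```python
# Prices quoted in the comment above, in USD per million tokens.
PRICE_PER_MTOK = {"input": 0.435, "output": 0.87, "cache_read": 0.003625}


def request_cost(input_tokens, output_tokens, cache_read_tokens=0):
    """Return the USD cost of one request at the quoted rates."""
    total = (
        input_tokens * PRICE_PER_MTOK["input"]
        + output_tokens * PRICE_PER_MTOK["output"]
        + cache_read_tokens * PRICE_PER_MTOK["cache_read"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 1M fresh input, 1M output, 1M cached-read tokens.
    print(f"${request_cost(1_000_000, 1_000_000, 1_000_000):.6f}")
```

At these rates, a million tokens of each kind comes to roughly $1.31 in total, with cache reads contributing well under a cent.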

No. They still have enormous profit margins on inference with these prices.
Their margins don't affect my own assessment of end-user pricing as cheap.
Any source to back up this claim, pretty please?
Source? Countless providers serve open-weight models for fun and profit.
I highly doubt there is any margin at those inference prices.
> I highly doubt there is any margin at those inference prices.

And yet, OpenCode Go offers DeepSeek flash 6 times cheaper than DeepSeek itself. And they claim they are still profitable.

Part of their model is that not everyone will use their entire quota each month. I don't think I will. I use under $1/day with DeepSeek v4 Flash. We get $60 of usage for the $10 sub.
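The quota arithmetic from that comment, as a quick sketch. The $10 subscription, $60 quota, and $1/day ceiling are taken from the comment; the 30-day month and the function names are assumptions for illustration.

```python
SUBSCRIPTION_USD = 10.0   # monthly subscription price, from the comment
QUOTA_USD = 60.0          # usage credit included per month, from the comment
DAILY_SPEND_USD = 1.0     # upper bound on daily usage, from the comment
DAYS_PER_MONTH = 30       # assumption: a 30-day billing month


def credit_multiplier(quota=QUOTA_USD, price=SUBSCRIPTION_USD):
    """How many dollars of usage credit each subscription dollar buys."""
    return quota / price


def quota_utilization(daily_spend, days=DAYS_PER_MONTH, quota=QUOTA_USD):
    """Fraction of the monthly quota consumed at a steady daily spend."""
    return daily_spend * days / quota


if __name__ == "__main__":
    print(f"credit per dollar: {credit_multiplier():.1f}x")
    print(f"quota used at $1/day: {quota_utilization(DAILY_SPEND_USD):.0%}")
```

Under these assumptions, spending just under $1/day uses at most about half of the monthly quota, and the 6x credit-to-price ratio lines up with the "6 times cheaper" figure mentioned earlier in the thread.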
It’s near the frontier, meaning it’s the best intelligence for the price.

It’s not even close to the frontier, meaning the best intelligence.

I hardly notice DeepSeek being inferior to Claude Opus unless I have it working on tricky and under-defined problems. That is, I trust Opus to reason much better when it has the choice. Otherwise, IME DeepSeek is far cheaper and more effective for anything where the solution is even somewhat obvious.
Out of curiosity, what is your stack? And is this in a legacy project or a new one?

I have tried using DeepSeek Flash and Pro, but they make amateur mistakes. Sonnet level at best.

However, v4 Flash is absolutely amazing as a generalist model, and it’s what we’re using on a product built on top of LLMs. I wish I could code with it, but that’s not going to happen anytime soon.

I've used it across many new projects as well as many legacy ones. It does make amateur mistakes, so you can't leave it unsupervised for hours like I do with Claude, but it's so much cheaper that weeks of heavy usage haven't even cost me $10 yet. The only other downside, IMO, is that Pro is pretty slow, even compared to frontier models: only around 120 t/s, IIRC.
Yes, I also noticed it is pretty slow, which sort of defeated the purpose of using it for me.

Usually I'm working on a large task, typically with Opus, while also running a bunch of smaller tasks in their own independent worktrees. Those still need supervision, but less. My goal was to use DeepSeek to drive the cost of those down, but it was too slow and unreliable...

Yes, I could tolerate the unreliability better if it were faster, but it's really not. So it's too slow for me to actively supervise it, but too unreliable for me to trust it unsupervised. The shitty middle. I often have multiple of them open at a time and check my terminal every few minutes to lead them along. Mostly works.