by rfoo 490 days ago
I know that; I'm in this game. I was comparing the API throughput/ttft/ttbt of DeepSeek's own R1 API, before it went viral in the West, with o3-mini's.

I remain unconvinced that DeepSeek themselves didn't optimize their own V3 inference well enough and left another 2x~3x improvement on the table.

1 comment

I am sure DeepSeek did optimize the inference cost of R1. They have not yet released an efficient MoE downscaling of it, i.e., an R1-mini.