Hacker News
by
simianwords
130 days ago
It’s interesting that they kept the price the same, given that inference on Cerebras is much more expensive.
2 comments
diwank
130 days ago
I don't think this is Cerebras. Running on Cerebras would change model behavior a bit; it could potentially give a ~10x speedup, and it'd be more expensive. So most likely this is them writing new, more optimized kernels for the Blackwell series, maybe?
simianwords
130 days ago
Fair point, but one question remains: why is this speedup available only in the API and not in ChatGPT?
chillee
130 days ago
This is almost certainly not being done on Cerebras.