by mynti 121 days ago
Going by the rough numbers in the blog post, ~1k tokens a second on Cerebras should put it at about the same size as GLM 4.7, which is also available at 1k tokens a second. They also say it is a smaller model than the normal Codex model.
1 comment

You can’t extrapolate the size of a model from its speed that way. Architecture differences, load, etc. will screw up the approximation.
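
To make that concrete, here is a rough back-of-the-envelope sketch. It assumes decoding is memory-bandwidth-bound, so tokens per second is roughly bandwidth divided by the bytes of weights read per token. The function name, the `share_of_bandwidth` knob, and every number below are made up for illustration; none of them are real specs for Cerebras, GLM 4.7, or Codex.

```python
import math


def decode_tokens_per_second(active_params_billion, bytes_per_param,
                             bandwidth_gb_per_s, share_of_bandwidth=1.0):
    """Crude estimate of decode speed for one stream.

    Assumes every active weight is read once per generated token and that
    memory bandwidth is the only bottleneck (an assumption, not a measurement).
    """
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param
    usable_bandwidth = bandwidth_gb_per_s * 1e9 * share_of_bandwidth
    return usable_bandwidth / bytes_per_token


# Hypothetical hardware bandwidth, chosen only so the numbers come out round.
BANDWIDTH_GB_PER_S = 30_000

# Three hypothetical models with very different sizes and formats.
dense = decode_tokens_per_second(30, 1, BANDWIDTH_GB_PER_S)      # 30B dense, 8-bit weights
moe = decode_tokens_per_second(30, 1, BANDWIDTH_GB_PER_S)        # 300B total MoE, 30B active, 8-bit
quantized = decode_tokens_per_second(15, 2, BANDWIDTH_GB_PER_S)  # 15B dense, 16-bit weights

# Same 30B dense model, but the server is busy and this stream gets half the bandwidth.
loaded = decode_tokens_per_second(30, 1, BANDWIDTH_GB_PER_S, share_of_bandwidth=0.5)

print(dense, moe, quantized, loaded)
```

Under these assumptions a 15B, a 30B, and a 300B-total model all land on the same speed, and the same model runs at half speed under load, so the observed tokens per second alone does not pin down the size.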