| Great article. >Now, you still want to train the best model you can by cleverly leveraging as much compute as you can and as many trillion tokens of high quality training data as possible, but that's just the beginning of the story in this new world; now, you could easily use incredibly huge amounts of compute just to do inference from these models at a very high level of confidence or when trying to solve extremely tough problems that require "genius level" reasoning to avoid all the potential pitfalls that would lead a regular LLM astray. I think this is the most interesting part. We always knew a huge fraction of the compute would be on inference rather than training, but it feels like the newest developments is pushing this even further towards inference. Combine that with the fact that you can run the full R1 (680B) distributed on 3 consumer computers [1]. If most of NVIDIAs moat is in being able to efficiently interconnect thousands of GPUs, what happens when that is only important to a small fraction of the overall AI compute? [1]: https://x.com/awnihannun/status/1883276535643455790 |
Imagine having 300. Could you build even better models? Is DeepSeek the right team to deliver that, or can OpenAI, Meta, HF, etc. adapt?
Going to be an interesting few months on the market. I think OpenAI lost a LOT in the board fiasco. I am bullish on HF. I anticipate Meta will lose folks to brain drain in response to management equivocation around company values. I don't put much stock into Google or Microsoft's AI capabilities, they are the new IBMs and are no longer innovating except at obvious margins.