by randomtoast 744 days ago
The following is all guesswork:

Since the start of their partnership in 2019, OpenAI has primarily utilized Microsoft's Azure data centers for training its models. In 2023, Microsoft acquired approximately 150,000 H100 GPUs. [1]

The initial version of GPT-4 ran on a cluster of A100 GPUs. It is likely that GPT-5 will run on the newly acquired H100 GPUs, and it is plausible that GPT-4 Turbo and GPT-4o also utilize this infrastructure. The inference speed of GPT-5 should not be significantly slower than that of GPT-4 to ensure it remains practical for most applications.

Assuming the H100 is 4.6 times faster for inference than the A100 [2], a model roughly that much larger could run on H100s at about GPT-4's original speed, which gives a rough scaling budget. I anticipate GPT-5 to be at least five times larger in terms of model parameters. Given that both the A100 and the H100 top out at 80 GB of memory, it is unlikely we will see a single gigantic dense model. Instead, we can expect an increase in the number of experts. If GPT-4 operates as a mixture of experts with 8x220 billion parameters (about 1.76 trillion in total), then GPT-5 might scale up to something like 40x220 billion parameters (about 8.8 trillion). However, the exact release date, safety measures, and benchmark performance of GPT-5 remain uncertain.
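
For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch in Python. The 220 billion parameters per expert and the expert counts are the speculative figures from this comment, not confirmed numbers. The 2 bytes per parameter (16-bit weights) is an extra assumption of mine, and the estimate ignores KV cache, activations, and any sharing of layers between experts.

```python
import math

GPU_MEMORY_GB = 80                   # A100 / H100 memory capacity cited above
PARAMS_PER_EXPERT = 220 * 10**9      # speculative figure from the comment
BYTES_PER_PARAM = 2                  # assumption: 16-bit weights


def total_params(num_experts, params_per_expert=PARAMS_PER_EXPERT):
    """Total parameter count of a mixture of equally sized experts."""
    return num_experts * params_per_expert


def gpus_per_expert(params_per_expert=PARAMS_PER_EXPERT,
                    bytes_per_param=BYTES_PER_PARAM,
                    gpu_memory_gb=GPU_MEMORY_GB):
    """Minimum number of GPUs needed just to hold one expert's weights."""
    weight_gb = params_per_expert * bytes_per_param / 1e9
    return math.ceil(weight_gb / gpu_memory_gb)


gpt4_params = total_params(8)        # hypothesized GPT-4 layout
gpt5_params = total_params(40)       # hypothesized GPT-5 layout

print(f"GPT-4 guess: {gpt4_params / 1e12:.2f}T parameters")
print(f"GPT-5 guess: {gpt5_params / 1e12:.2f}T parameters")
print(f"Scale factor: {gpt5_params / gpt4_params:.1f}x")
print(f"GPUs per expert (weights only): {gpus_per_expert()}")
```

Under these assumptions even a single 220B expert does not fit in one 80 GB card, so each expert would have to be sharded across several GPUs regardless of how many experts there are.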

[1]: https://www.tomshardware.com/tech-industry/nvidia-ai-and-hpc...

[2]: https://nvidia.github.io/TensorRT-LLM/blogs/H100vsA100.html