Hacker News new | ask | show | jobs
by polyanos 2164 days ago
>Why do you believe they won't release the model? It's Open AI after all.

Because they want to commercialize the model "to cover the costs of research", it's stated in their blog post FAQ[0].

>Google already trained a model which is bigger than GPT-3.

Source?, a fast Google search yielded no results and I'm curious.

[0]https://openai.com/blog/openai-api/

2 comments

They commercialize the API. The model is so large you can't run it on a single GPU, AFAIK. OpenAI developed an infrastructure to run the model efficiently. So many companies would rather use API than deploy the model on their own hardware.

They will probably release the baseline model, not model optimized for deployment. There are many optimizations possible such as precision reduction, pruning, distillation, and they don't have to share these optimizations.

> a fast Google search yielded no results and I'm curious.

List of pre-trained models on huggingface: https://huggingface.co/models

You can see some of them are prefixed with "google". Also ALBERT is from Google Research.

That's just natural language processing, I dunno what they have in other fields.

They trained a >600B parameter translation model (GPT-3 is 175B parameters).

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding (https://arxiv.org/abs/2006.16668)