It is interesting how persistently dominant GPT-4 is: https://twitter.com/lmsysorg/status/1735729398672716114 Off the top of my head, I can think of at least five foundation models (Llama, Claude, Gemini, Falcon, Mistral) that are all trading blows, but GPT is still a head above them and has been for a year now. Transformer LLMs are simple enough that, demonstrably, anyone with a million bucks of GPU time can make one, but they can't quite catch up with OpenAI. What's their special sauce?
I’m speculating here, but I think Google has always refrained from getting into the manual side of things, such as data annotation and cleaning. With LLMs, it became obvious very quickly that data is what matters. Seeing Microsoft’s phi-2 play, I’m even more convinced of this.
DeepMind understood the scaling properties and came up with Chinchilla, but it couldn’t integrate well with Google when it came to working out what kind of data Google should supply to improve model quality.
OpenAI put in annotation/cleaning work almost right from the start. I’m not too familiar with the details, but human labor was heavily used to improve training data quality after ChatGPT launched.