Hacker News
by adpirz 1236 days ago
For people better versed in this space: does GPT, or OpenAI more broadly, have a meaningful moat? It seems like there will be a number of these models out there, and this isn't as great as, say, Google's up-till-now advantage in search relevancy.
8 comments

A couple points which I don't see elsewhere:

1) They have the best quality model. Better quality means more users. More users means more data. Which means higher quality...

2) Operationalizing & scaling these models is non-trivial. I'm not sure what the state of distillation/pruning is for GPT-3, but I imagine they have figured out some proprietary techniques.

3) It's not just publishing a single model, but making it so people can fine tune and push their own. Because they've gotten good at 2, now anyone can create their own version of GPT customized for their use case.

Will Google or others be able to do the same eventually? Definitely.

My point is more that it's not just training the model and running it.

I don't view any of those things as a meaningful moat against the other companies with AI labs.

Specifically, training data is not primarily coming from interactions with the model. While with RLHF this data might become more important, it is still a very small portion.

I don't know either way, but as an example that it might be: the Google PageRank patent has expired, yet Google remains valuable because its personalisation of results became a moat.
> but making it so people can fine tune and push their own

How are they making it easy for people to fine tune their own?

https://beta.openai.com/docs/guides/fine-tuning

You can build your own model based on GPT in a way that users don't have to be in the weeds of AI research to do.
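
As a rough sketch of what that guide asks for, assuming the prompt/completion JSONL format it described at the time: the training data is a file with one JSON object per line. The example tickets, labels, and helper names below are made up for illustration, and uploading the file and starting the job are left out since those steps depend on the OpenAI tooling.

```python
import json

# Hypothetical examples; a real dataset would come from your own use case.
EXAMPLES = [
    {"prompt": "Ticket: my invoice is wrong ->", "completion": " billing\n"},
    {"prompt": "Ticket: the app crashes on login ->", "completion": " bug\n"},
]


def to_jsonl(examples):
    """Serialize examples as JSON Lines: one JSON object per line."""
    lines = []
    for ex in examples:
        # Keep only the two expected keys; a missing key raises KeyError early.
        record = {"prompt": ex["prompt"], "completion": ex["completion"]}
        lines.append(json.dumps(record))
    return "\n".join(lines) + "\n"


def parse_jsonl(text):
    """Read JSON Lines back into a list of dicts (handy as a sanity check)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # Print the file contents; redirect to e.g. train.jsonl to save them.
    print(to_jsonl(EXAMPLES), end="")
```

The round trip through `parse_jsonl` is only there to check that each line is valid JSON before handing the file to anything else.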

I think that if you could conclusively answer that question you would be sipping drinks on a beach somewhere. The people who are investing seem to think so. Also, the applications of this tech are broader than search, but still include it. Consider a company that has a serious chance of eating into Google's search revenue while also generating new revenue streams: what is that worth? What if you already have 1000 products that would benefit from the new capabilities? This is probably an easy investment decision even if Microsoft gains nothing from the actual investment itself.
It seems like the fine-tuning dataset to go from GPT -> ChatGPT is pretty valuable, particularly because it is proprietary.

Still, I agree with your characterization that we should see many similar models over time. As an example, see Deepmind’s Sparrow: https://www.deepmind.com/blog/building-safer-dialogue-agents

Yes & No.

GPT <> ChatGPT: probably not. It's not hard for other big players to enter this space. It's mostly egg-on-face for Google that they haven't, given that Google basically invented the model that OpenAI uses and has big versions internally. There's nothing fundamental stopping Google Docs from adding ChatGPT to their UI and getting way more consumer training data than OpenAI can get without a similar play, or stopping Apple from doing something. Similar to what happened with mapping software, Google/Microsoft/Azure & Chinese equivalents will all offer it with similar competitiveness, and then complements like Facebook/Salesforce will do more OSS to compete against them. That's already begun.

Copilot: The interesting proprietary advantage IMO is program synthesis. It's really enabled by Microsoft VSCode <> GitHub <> OpenAI. Without even doing any AI investments, the winner of this fight might be GitLab, as Google/AWS/Salesforce/etc. decide what to do. Before, GitLab might have been a nice vehicle for shift-left sales (cloud hosting, security scans, ..), but program synthesis UIs can make Software 2.0 real.

> There's nothing fundamental stopping Google Docs from adding ChatGPT to their UI and getting way more consumer training data than OpenAI can get without a similar play.

OpenAI could get exactly the same (or more, idk) data by integrating into Teams, considering the Microsoft partnership.

Totally!

My point is ChatGPT isn't a high-moat advantage for text/Q&A for Microsoft. Their top competitors here have a similarly huge UI footprint. In contrast, program synthesis has a much higher data moat.

There are definitely more people using Docs than Teams.

I doubt that Microsoft will allow OpenAI to train on Teams data from other businesses.

You might be right. Do you have a source?

They are fine with tons of telemetry and Candy Crush ads on the start bar. There were also other instances where Microsoft shared data before Google.

In addition to that, one could argue they already share data from businesses' source code with Copilot.

They don't share private GitHub data with Copilot. Teams data is private by default.

Teams has 270 million monthly users (you can Google it, I'm looking at a GeekWire post) and Google has 2 billion monthly G Suite users (Business Insider).

No, I don't believe they do, product-wise. We'll see soon enough, I imagine. The thing is, even though I don't think they have a moat in terms of model/product, they have a moat in terms of talent and capital. Only a few teams operate at their scale and sophistication, and it's hard to get there.

I view this as Microsoft paying for talent the same way DeepMind was initially integrated into Google, and at the same time making the bet that this space will continue to be immensely valuable and relevant going forward.

Pretty exciting times all things considered!

Training and the guard rails.

Beyond that, if it becomes built into the (MS) tools that people are using then convenience is going to be a very hard barrier for Google (or anyone else) to overcome.

Google will continue to integrate their own LLMs into their office suite. Microsoft needs OpenAI because their own LLM research hasn’t been as fruitful. I don’t see a huge moat here for Microsoft.

Then again, Microsoft’s office software is the “gold standard” (however poorly deserved) and even with amazing AI features, Google’s stuff lacks in important ways that will keep Microsoft in a strong position with or without AI features.

Google still has its own platforms. If we take a look at the last generation of consumer AIs, voice assistants, Google definitely beat Microsoft, and not only because Cortana sucked.

Microsoft may dominate the AI market for office stuff soon, but for general-purpose language models Google still has a great shot, especially when it comes to mobile platforms.

The problem is that Google's model relies almost entirely on advertising...and AI will simply be almost impossible to wrap into that model. Microsoft doesn't really have that handicap.
G-suite?
I recently wondered if one of the reasons for Google shutting down Stadia was to quickly ramp up their GPU server stockpile and redirect the resources at GPT modelling, to help catch up.
Google is not constrained by GPUs here and likely will train on TPU pods anyways.
Good point, I imagine they would be using those as well. Know of any resources for speed comparisons on similar models?
Great observation.

If it wasn't prescient, it was incredible dumb luck.

There are a lot of finicky things that go into training a model as large as this.

But that knowledge will disperse and is already held in many competitor companies. I do not think that OpenAI has a substantial moat here.

If it was "open" it should not need a moat, nor have one.