Hacker News new | ask | show | jobs
by thethimble 1236 days ago
It seems like the fine tune dataset to go from GPT -> ChatGPT is pretty valuable, particularly because it is proprietary.

Still, I agree with your characterization that we should see many similar models over time. As an example, see Deepmind’s Sparrow: https://www.deepmind.com/blog/building-safer-dialogue-agents

1 comments

Yes & No.

GPT <> ChatGPT: probably not. It's not hard for other big players to enter this space. It's mostly egg-on-face for Google that they haven't given that Google basically invented the model that OpenAI uses and has big versions internally. There's nothing fundamental stopping Google Docs from adding ChatGPT to their UI and getting way more consumer training data than OpenAI can get without a similar play, or for Apple to do something. Similar to what happened with mapping software, google/microsoft/azure & chinese equivs will all offer with similar competitiveness, and then complements like facebook/salesforce will do more OSS to compete against. That's already begun.

Copilot: The interesting proprietary advantage IMO is program synthesis. It's really enabled by Microsoft VSCode <> Github <> OpenAI. Without even doing any AI investments, the winner of this fight might be Gitlab, as Google/AWS/Saleforce/etc decide what to do. Before gitlab might have been a nice vehicle for shift-left sales (cloud hosting, security scans, ..), but program synthesis UIs can make Software 2.0 real.

> There's nothing fundamental stopping Google Docs from adding ChatGPT to their UI and getting way more consumer training data than OpenAI can get without a similar play.

OpenAI could get exactly the same (or more, idk) data by integrating into Teams, considering the Microsoft partnership.

Totally!

My point is chatgpt isn't a high-moat advantage for text/q&a for microsoft. Their top competitors here have similarly huge UI footprint. In contrast, program synthesis has a much higher data moat.

There are definitely more people using Docs than Teams.

I doubt that Microsoft will allow OpenAI to train on teams data from other businesses.

You might be right, do you have a source?

They are fine with tons of telemetry and candy crush ads on the start bar. There were also other instances were Microsoft shared data before Google.

In addition to that, one could argue they already share date from businesses source code with copilot.

They don't share private GitHub data with copilot. Teams data is default private.

Teams has 270 million monthly users (you can Google it, I'm looking at a geekwire post) and Google has 2 billion monthly g suite users (business insider)