Hacker News
by Yusefmosiah 898 days ago
I'm rooting for open source models too. But I'm also a dev working to use AI models for consumer apps. And currently, open source is quite a ways behind.

Consider some points:

1. GPT-4 was trained in Summer 2022. OpenAI already has better models.

2. It's not just the model, but the infra around it: ChatGPT has tool use — image generation, web search, and API calls through "actions" — built in.

3. More infra: ChatGPT has a built-in moderation endpoint. This is not sexy, and although many of us hackers want uncensored AI, most applications will need some moderation.

4. ChatGPT has >100M users, and there is some lock-in already. ChatGPT users don't want their chats split over multiple apps.

5. Open source models (and proprietary models like Grok) are fine-tuned on synthetic data generated by GPT-4. This fine-tuning process limits them to sub-GPT-4 level.

6. Even the best open source models (e.g., Mixtral) are significantly worse than GPT-4. Their low cost makes them attractive, but if you believe, as I do, that sub-GPT-4 level models are just not that compelling, the open source AI ecosystem has a lot of catching up to do.

As long as there exist proprietary models that are an order of magnitude more capable than open source models, I expect the bulk of the value and usage will accrue to the ecosystems of the proprietary models. I do hope that at some point soon this changes. Maybe open source models will achieve a flywheel of data and crowdsourced algorithmic optimization, and perhaps some form of efficient distributed training on consumer hardware will become possible. This would be awesome, IMO.

1 comment

Thanks for your insight.