by dingclancy 759 days ago
This is the first demo where you can really sense that beating LLM benchmarks should not be the target. Just remember the time when the iPhone had meager specs but ultimately delivered a better phone experience than the competition.

This is the power of a model in which you own the whole stack and build a product. Open Source will focus on LLM benchmarks since that is the only way foundational models can differentiate themselves, but that does not mean it is a path to a great user experience.

So Open Source models like Llama will be here to stay, but it feels more like if you want to build a compelling product, you have to own and control your own model.

3 comments

OpenAI blew up when they released ChatGPT. It was more of a UX breakthrough than pure tech, since GPT3 had already been available for a few months.

This feels similar, with OpenAI trying to put their product even more into the daily lives of their users. With GPT4 being good enough for nearly all basic tasks, the natural language and multimodality could be big.

I don’t think Llama being open sourced means Meta has lost anything. If anything it’s just a way to get free community contribution, like Chrome from Chromium. Meta absolutely intends to integrate their version of Llama in their products, much as OpenAI is creating uses for their LLM beyond just the technology.

Depends on the benchmarks. AI that can actually do the job of software developers, theoretical computer scientists, mathematicians etc. end to end would be significantly more impactful than this.

I want to see AI advance the state of the art in our understanding of the world - physics, mathematics etc. - the way it advanced the state of the art in our understanding of Go.

Doing these end-to-end jobs still comes down to user experience and UI, if we are talking about getting to mass market.

This GPT-4o model is a classic example. It is essentially the same model as GPT-4, but its multimodal features, voice conversations, math, and speed are as revolutionary as the creation of the model itself.

Open Source LLMs will end up as models on GitHub and will be used by developers, but it looks like even if GPT-4o is only 3 months ahead of other models in terms of benchmarks, the UI + use case + model is 2 years ahead of the competition. And I say that because there is still no chat product that is close to what ChatGPT is delivering now, even though there are models that are close to GPT-4o today.

So if it is sticky for 2 more years, their lead will just grow and we will just end up with more open source models that are technically behind by 3 months but behind product-wise by 2 years.