

	by crakenzak 829 days ago
	Can’t wait to see some Phi-3 fine-tunes! Will be testing this out locally; it's such a small model that I can run it without quantization. Feels incredible to be living in a time of such breakneck innovation. What are the chances we’ll have a <100B parameter GPT4/Claude Opus model in the next 5 years?

5 comments

nl 828 days ago

> What are the chances we’ll have a <100B parameter GPT4/Claude Opus model in the next 5 years?

In 5 years' time we'll have adaptive compute, and the idea of talking about the parameter count of a model will seem as quaint as talking about the cylinder capacity of a jet engine.


regularfry 828 days ago

It feels like it's going to be closer than that. People always forget that GPT4 and Opus have the advantage of behind-the-curtain tool use that you just can't see, so you don't know how much of a knowledge or reasoning leg-up they're getting from their internal tooling ecosystem. They're not really directly comparable to a raw LLM downloaded from HF.

What we need is a standardised open harness for open source LLMs to sit in that gives them both access to tools and the ability to write their own, and that's (comparatively speaking) a much easier job than training up another raw frontier LLM: it's just code, and they can write a lot of it.


Deverauxi 828 days ago

5 years? 5 years is a millennium these days.

We’ll have small local models beating GPT-4/Claude Opus in 2024. We already have sub-100B models trading blows with former GPT-4 models, and the future is racing toward us. All these little breakthroughs are piling up.


refulgentis 828 days ago

Absolutely not on the first one. Not even close.


ashirviskas 828 days ago

Why not? There's still 7 months left for breakthroughs.


refulgentis 828 days ago

"Small" leaves wiggle room, but it's extremely unlikely that traditionally small models (<= 7B) will get there this year, even on these evals.

UX matching is a whole different matter and needs a lot of work. I worked heavily with Llama 8B over the last few days, and with Phi 3 today, and the Q+A benchmarks don't tell the full story. E.g., it's nigh impossible to get Llama _70_B to answer in JSON, and when Phi sees RAG context from search, it goes off inventing new RAG material and a new question.


bugglebeetle 828 days ago

We already do. It’s called Llama 3 70B Instruct.


vitorgrs 828 days ago

Llama 3 is awful in non-English languages. 95% of their training data is in English...

GPT is still the king when talking about multiple languages/knowledge.


stavros 828 days ago

Is it released?
