Hacker News new | ask | show | jobs
by crakenzak 781 days ago
Can’t wait to see some Phi-3 fine tunes! Will be testing this out locally, such a small model that I can run it without quantization.

Feels incredible to be living in a time with such neck breaking innovations. What are chances we’ll have a <100B parameter GPT4/Claude Opus model in the next 5 years?

5 comments

> What are chances we’ll have a <100B parameter GPT4/Claude Opus model in the next 5 years?

In 5 years time we'll have adaptive compute and the idea of talking about the parameter count of a model will seem as quaint as talking about the cylinder capacity of a jet engine.

It feels like it's going to be closer than that. People always forget that GPT4 and Opus have the advantage of behind-the-curtain tool use that you just can't see, so you don't know how much of a knowledge or reasoning leg-up they're getting from their internal tooling ecosystem. They're not really directly comparable to a raw LLM downloaded from HF.

What we need is a standardised open harness for open source LLMs to sit in that gives them both access to tools and the ability to write their own, and that's (comparatively speaking) a much easier job than training up another raw frontier LLM: it's just code, and they can write a lot of it.

5 years? 5 years is a millennia these days.

We’ll have small local models beating gpt-4/Claude opus in 2024. We already have sub 100b models trading blows with former gpt-4 models, and the future is racing toward us. All these little breakthroughs are piling up.

Absolutely not on the first one. Not even close.
Why not? There's still 7 months left for breakthroughs.
Small leaves wiggle room, but it's extremely unlikely trad small, <= 7B, will get there this year even on these evals.

UX matching is a whole different matter and needs a lot of work: Worked heavily with Llama 8B over last days, and Phi 3 today, and the Q+A benchmarks don't tell the full story. Ex. It's nigh impossible to get Llama _70_B to answer in JSON; when Phi sees RAG from search it goes off inventing new RAG material and a new question.

We already do. It’s called LLama 3 70B Instruct.
Llama 3 is awful in non-English. 95% of their training data is in English....

GPT is still the king when talking about multiple languages/knowledge.

Is it released?