

by zarzavat 64 days ago

The only thing that we are sure can't be highly compressed is knowledge, because you can only fit so much information into a given entropy budget without losing fidelity.

The minimal size limits of reasoning abilities are not clear at all. It could be that you don't need all that many parameters. In which case the door is open for small focused models to converge to parity with larger models in reasoning ability.

If that happens we may end up with people using small local models most of the time, and only calling out to large models when they actually need the extra knowledge.
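A minimal sketch of that routing idea, assuming a hypothetical interface where a model returns `None` when it judges that it lacks the knowledge to answer. `Router`, `toy_local`, and `toy_remote` are illustrative stand-ins, not real model APIs:

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical interface: a model maps a prompt to an answer,
# or returns None when it cannot answer from what it knows.
Model = Callable[[str], Optional[str]]


@dataclass
class Router:
    local: Model   # small model running on the user's machine
    remote: Model  # large hosted model, used only as a fallback

    def ask(self, prompt: str) -> Tuple[str, str]:
        """Return (source, answer), trying the local model first."""
        answer = self.local(prompt)
        if answer is not None:
            return ("local", answer)
        answer = self.remote(prompt)
        if answer is None:
            raise LookupError("neither model could answer")
        return ("remote", answer)


def toy_local(prompt: str) -> Optional[str]:
    # Stand-in for a small reasoning model: it can add two integers
    # but carries no world knowledge.
    parts = prompt.split("+")
    if len(parts) == 2 and all(p.strip().isdigit() for p in parts):
        return str(int(parts[0]) + int(parts[1]))
    return None


def toy_remote(prompt: str) -> Optional[str]:
    # Stand-in for a large model: a lookup table of memorized facts.
    facts = {"capital of France?": "Paris"}
    return facts.get(prompt)


router = Router(local=toy_local, remote=toy_remote)
print(router.ask("2 + 3"))
print(router.ask("capital of France?"))
```

The hard part in practice is the escalation signal: this sketch assumes the small model can reliably tell when it does not know something, which is itself an open question.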

2 comments

idle_zealot 64 days ago

> and only calling out to large models when they actually need the extra knowledge

When would you want lossy encoding of lots of data bundled together with your reasoning? If it is true that reasoning can be done efficiently with fewer parameters, it seems like you would always want it operating ordinary search and retrieval tools to access knowledge rather than risking hallucination.

And re: this discussion of large data centers versus local models, do recall that we already know it's possible to make a pretty darn clever reasoning model that's small and portable and made out of meat.


Gareth321 63 days ago

I find it difficult to understand the distinction between parametric knowledge and reasoning skills in LLMs. I still think of them as distinct, but I understand there is significant overlap. Arguably, they are the same thing in LLMs. So I would assume that if reasoning is high quality, using RAG could be logical (if much slower). However, if the lack of parametric knowledge impacts reasoning, then use of larger models seems warranted. A dumb LLM wouldn't offer sufficient results even with all the RAG in the world.


aldonius 63 days ago

I guess we can imagine a pure reasoning model (if that's even the right word any more) with almost zero world-knowledge. How does it know what to look for? How does it do any meaningful communication at all?

So I think it's useful to have an imprecise-but-fairly-accurate set of world knowledge as part of an otherwise reasoning-heavy model. It's a cache.

And if it's an LLM, or something like that, I think it basically has to have world-knowledge built in, because what is natural language if not communication about the world?


dryarzeg 64 days ago

> we already know it's possible to make a pretty darn clever reasoning model

There is a problem, though: we know that it is possible, but we don't know how to do it (at least not yet, as far as I am aware). So we know the answer to the "what?" question, but we don't know the answer to the "how?" question.


adrianN 64 days ago

I would call brains with the needed support infrastructure small.


yorwba 64 days ago

I think you underestimate the amount of knowledge needed to deal with the complexities of language in general as opposed to specific applications. We had algorithms to do complex mathematical reasoning before we had LLMs, the drawback being that they require input in restricted formal languages. Removing that restriction is what LLMs brought to the table.

Once the difficult problem of figuring out what the input is supposed to mean was somewhat solved, bolting on reasoning was easy in comparison. It basically fell out with just a bit of prompting, "let's think step by step."

If you want to remove that knowledge to shrink the model, we're back to contorting our input into a restricted language to get the output we want, i.e. programming.
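To make the restricted-language point concrete, here is a small sketch of a "reasoner" that is exact inside a tiny formal language and useless outside it. It uses Python's standard `ast` module; the `evaluate` function and the choice of four arithmetic operators are assumptions made purely for illustration:

```python
import ast
import operator

# The whole "language": four binary operators over numeric literals.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr: str):
    """Evaluate +, -, *, / over numbers; reject everything else."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("input is outside the restricted language")

    try:
        tree = ast.parse(expr, mode="eval")
    except SyntaxError as exc:
        raise ValueError("input is outside the restricted language") from exc
    return walk(tree)


print(evaluate("(2 + 3) * 4"))

try:
    evaluate("what is two plus three?")
except ValueError as err:
    print("rejected:", err)
```

The evaluator never gets the arithmetic wrong, but the burden of translating "what is two plus three?" into `2 + 3` falls entirely on the user.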
