| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by lolinder 635 days ago
	That doesn't address the thing they're skeptical about, which is how much knowledge can be encoded in 3B parameters. 3B models are great for text manipulation, but I've found them to be pretty bad at having a broad understanding of pragmatics or any given subject. The larger models encode a lot more than just language in those 70B+ parameters.

1 comments

cyanydeez 635 days ago

Ok, but what we are probably debating is knowledge versus wisdom. Like, if I know 1+1 = 2, and I know the numbers 1 through 10, my knowledge is just 11, but my wisdom is infinite in the scope of integer addition. I can find any number, given enough time.

I'm pretty sure the AI guys are well aware of which types of models they want to produce. Models that can intake knowledge and intelligently manipulate it would mean general intelligence.

Models that can intake knowledge and only produce subsets of it's training data have a use but wouldn't be general intelligence.

BoorishBears 635 days ago

I don't think this is right.

Usually the problem is much simpler with small models: they have less factual information, period.

So they'll do great at manipulating text, like extraction and summarization... but they'll get factual questions wrong.

And to add to the concern above, the more coherent the smaller models are, the more likely they very competently tell you wrong information. Without the usual telltale degraded output of a smaller model it might be harder to pick out the inaccuracies.