by lolinder
635 days ago
That doesn't address the thing they're skeptical about, which is how much knowledge can be encoded in 3B parameters. 3B models are great for text manipulation, but I've found them to be pretty bad at having a broad understanding of pragmatics or any given subject. The larger models encode a lot more than just language in those 70B+ parameters.
|
I'm pretty sure the AI guys are well aware of which types of models they want to produce. Models that can take in knowledge and intelligently manipulate it would amount to general intelligence.
Models that can take in knowledge but only reproduce subsets of their training data have their uses, but wouldn't be general intelligence.