| Yeah, I'm not sure what the exact context of the statement is. I am absolutely certain that we have not already discovered, let alone implemented, the best possible learning algorithms. Humans have had more time to evolve, so there's a good chance that we do learn more efficiently and have developed specialized brains that are primed for learning things like how to navigate the physical world on planet Earth as bipeds. That said, to say that we operate with less training data is just ignoring the reality of all the data we're training on at all times. If we were to model in lossless fidelity what humans are capable of seeing, hearing, smelling, tasting, feeling, and thinking, consciously and subconsciously (essentially all the data flowing through our minds that we are constantly training on every moment of every day, even while we sleep or are unconscious), what sort of bitrate do you think would be required? Modern LLMs train on datasets in the what, tens of terabytes in size? Let's call it 100 TB. I would imagine that losslessly reproducing the full suite of human sensory data (whatever that means for things like taste, touch, smell) would require a bitrate that hits that 100 TB total relatively quickly. |
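To put rough numbers on "relatively quickly", here is a minimal back-of-envelope sketch. The bitrates are purely illustrative placeholders, not measurements of human sensory input, and the function name `days_to_reach` is made up for this example; the 100 TB target is the figure from the comment, taken as decimal terabytes.

```python
# Back-of-envelope: how long does a continuous stream take to reach a given total?
# The bitrates below are illustrative placeholders, NOT measurements of human senses.

SECONDS_PER_DAY = 86_400


def days_to_reach(total_bytes: float, bits_per_second: float) -> float:
    """Days of continuous streaming needed to accumulate total_bytes."""
    if bits_per_second <= 0:
        raise ValueError("bits_per_second must be positive")
    return total_bytes * 8 / bits_per_second / SECONDS_PER_DAY


# The "let's call it 100 TB" figure from the comment (decimal TB, an assumption).
TARGET_BYTES = 100e12

if __name__ == "__main__":
    for label, bps in [("1 Mbit/s", 1e6), ("100 Mbit/s", 1e8), ("1 Gbit/s", 1e9)]:
        print(f"{label:>11}: {days_to_reach(TARGET_BYTES, bps):,.1f} days")
```

Under these assumed rates, the answer swings from roughly 25 years at 1 Mbit/s to a bit over 9 days at 1 Gbit/s, so the conclusion depends almost entirely on which bitrate you think is fair.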
How much training data did LLMs need to be able to do so?
FWIW, I still see them make up wrong words that don't follow any grammatical pattern, especially in Serbian, where there is less training data.
Serbian is pretty complex though (https://www.languagegrowth.com/en/blog/serbian-grammar-basic...), which made it even more surprising to see the kids pick them up so early, when their vocabulary is probably not even 2000 words yet.