by PlasmonOwl
912 days ago
The author is leveraging mental inflexibility to generate an emotional response of denial. Sure, his points are correct, but they are constrained. Let’s remove 2 constraints and reevaluate:
1 - Babies learn much more with much less
2 - Video training data can, in theory, be made at incredible rates
The question becomes: why is the author focusing on approaches in AI investigated in like 2012? Does the author think SOTA is text only? Are OpenAI or other market leaders only focusing on text? Probably not.
|
If babies learn much more from much less, isn't that evidence that the LLM approach isn't as efficient as whatever approach humans implement biologically, so it's likely LLM processes won't "scale to AGI"?
For video data, that's not how LLMs work (or any NNs, for that matter). You have to train them on what you want them to look at, so if you want a model to predict the next token of text given an input array, you need to train it on input arrays and output tokens.
You can extract the data in the form you need from the video content, but presumably that's already been done for the most part, since video transcripts are likely included in the training data for GPT.
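
To make the "input arrays and output tokens" point concrete, here's a rough sketch of how a transcript might be turned into next-token training pairs. The function name `next_token_pairs`, the window size, and the whitespace "tokenizer" are all made up for illustration; real models use subword tokenizers and much longer contexts, so treat this as a toy, not how any particular model is actually trained.

```python
def next_token_pairs(tokens, context_size):
    """Slide a fixed-size window over the tokens.

    Each pair is (input array of context_size tokens, the token that follows).
    Hypothetical helper for illustration only.
    """
    pairs = []
    for i in range(len(tokens) - context_size):
        context = tokens[i:i + context_size]   # the input array
        target = tokens[i + context_size]      # the output token to predict
        pairs.append((context, target))
    return pairs


# Assumption: splitting on whitespace stands in for a real tokenizer.
transcript = "the cat sat on the mat"
tokens = transcript.split()

pairs = next_token_pairs(tokens, context_size=3)
for context, target in pairs:
    print(context, "->", target)
```

The point is just that the model only ever sees (input, target) examples in the form it's trained on, so raw video has to be converted into that form (e.g. a transcript) before it's useful to a text model.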