Hacker News
by wtbdqrs 745 days ago
I'm neither smart nor a dev, but there's no need to feed the kid the internet's data anymore, is there? The kid gets enough data from direct user input, and that preconfigures it well enough to recognize trash and leave it where it finds it, unless it can be upcycled or recycled, but that's a long story.

Corporate data, research, books, blogs: any tokens the kid trains itself on will "feel" right in its stomach, neither "too heavy" nor "too light" for its semantic mass. Whatever else the rest of the internet might have to offer in the future (comment sections) is so predictable that it would be a duplication of effort, and the next gen of AI won't waste any RAM on it.

1 comment

What you say makes sense, but I still doubt how far it can go on user interaction data. In a typical conversation between a user and an LLM, most of the text is generated by the LLM itself. I think the only thing most users do, and might ever do, is type the starting question, which might not be very nutritious for the LLM to learn anything new from, since it isn't feedback. Corporate data might be something, but the data from research, books, and blogs is just too small to push the wheels, not to mention that a bunch of it is definitely going to be AIGC in the future.