Hacker News
by Avicebron 35 days ago
That's a scary thought, LLMs training on LLM output. People trained by sheer ubiquity to think in and read LLM output produce their own LLM-esque writing.

Seems stifling. We'll need some way to reward human creativity and out-of-bounds thinking before our greatest corpus of human intellect is bounded by whenever and whatever the models were trained on.

4 comments

Writing and later the printing press have already considerably stifled human expressiveness. Language used to be much more fragmented and diverse before mass media (or the Bible in every household). In my grandmother’s time you would have difficulty understanding people from three villages down the road.
I'm not sure enabling people three villages apart to communicate with each other counts as "stifling human expressiveness"
I’m not sure that having people read LLM output does that either.
So is it that humans are inherently creative, machines could never do what we do? Or is it that humans will only replicate our training data, and so we have to ensure that machines don't bound our training data? Or are you going meta and gently pointing out the absurdity? (I hope it's this one!)
I think I have an answer. Humans don't have "training data" in the same way we think of it for LLMs. Yes, you can walk outside your house and quantify every electromagnetic pulse, random perturbation, etc., and then "train on it". But that isn't how people process information. We have the ability to process our entire "existence", if that makes sense, which means the density is much higher.

The LLM is bounded by its training data, and relying on it means we are as well.

I don't understand this mindset. Why is it that people on here think humans have some kind of magical ability machines don't or can't have? Five years ago I would never have predicted this kind of human chauvinism here. It's some kind of weird romanticism almost.
Maybe because everything LLM-written is written in the same style with no creativity, diversity, or idiosyncrasies? If all humans suddenly started writing in a single, bland, corporate style, that would be a tragedy, LLMs or not.
Because right now humans do have a magical ability machines don't. LLMs are a fuzzy reflection of what they've seen hundreds of times already; they don't have originality or intelligence (yet).

As a much more immediate practical matter, training LLMs on LLM output makes them worse overall; they degrade from doing that. So the more LLM-produced content fills the web, the less useful the web is as a data source for future LLM training. In addition to just being increasingly boring and vapid.

Saying they don't possess any level of intelligence is wild.
The intelligence is an emergent property of their ability to predict how a statement will proceed, therefore it is inevitably a reiteration or transformation at best. Lots of intelligent things can be produced from that, but nothing truly novel.
Human creativity is not only not being rewarded, but people are increasingly talking like consuming too few tokens is something that's actively used against them.