HN Mirror

by Gormo 59 days ago

> The camera analogy is a good one but I have never had a camera that had every great picture somebody else had taken, plus every work of art, baked into it.

I've never had an LLM that had any of that baked into it either. LLMs just have token correlations learned from those works. Getting an LLM to output its training data verbatim is something I'd expect to head into monkeys-on-typewriters territory. "Write something in the style of Shakespeare" and "give me the original text of Hamlet" are two very different things.

> I agree with the framing of the AI as a tool not an autonomous entity. The thing is, to me, it is exactly that framing that makes it so the use of that tool means "copying" more than it means "learning and taking inspiration and creating new art", because who is doing the learning and being inspired?

It's not learning or taking inspiration, though. It's just making statistical inferences based on token correlations. Whether that's analogous to how humans learn is, I think, a metaphysical question of little practical relevance. The fact remains that LLMs are not human, have no intentions of their own, do not exercise any kind of agency despite how often people employ the misnomer "agentic", and are ultimately glorified statistical models.

The LLM is a tool that extends human capacities in the same way as any other mathematical framework or technological device.

> I think of a trained AI like a lossy, highly compressed copy of its training data set.

I've seen a few people in this thread make that argument, but I just can't agree with it. It's not compression: compression, lossy or lossless, aims to deterministically encode a representation of the specific input data. The training data is analogous to the sample set used in a regression analysis to generate a polynomial function -- it's not valid to treat the output from any application of that polynomial as a copy of the original sample data.
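
To make the regression analogy concrete, here's a rough sketch in plain Python. The sample values and the `fit_line` helper are made up for illustration, and I'm using the simplest case, a degree-1 polynomial (a straight line), fitted by ordinary least squares. It's a toy and says nothing directly about how any real LLM is trained.

```python
# Toy illustration only: fit a degree-1 polynomial (a line) to a few
# made-up samples by ordinary least squares.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical "training" samples.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 5.0, 4.0]

slope, intercept = fit_line(xs, ys)

# The fitted model is just two numbers; the five samples are not stored in it.
predictions = [slope * x + intercept for x in xs]
residuals = [y - p for y, p in zip(ys, predictions)]

print("model:", slope, intercept)
print("predictions:", predictions)
print("residuals:", residuals)
```

In this toy case the fitted function keeps only a slope and an intercept, and evaluating it at the original x values gives back the trend of the samples rather than the samples themselves: every residual is nonzero.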