Hacker News
by Terr_ 495 days ago
This occurs because of ambiguous language that conflates the LLM algorithm with the training data and the derived weights.

The mysterious part involves whatever patterns might naturally exist within bazillions of human documents, and what partial/compressed patterns might exist within the weights that the LLM algorithm produces during training and then later uses.

Analogy: We built a probe that travels to an alien planet, mines out crystal deposits, and projects light through those fragments to show unexpected pictures of the planet's past. We know exactly how our part of the machine works, and we know the chemical composition of the crystals, but...

1 comment

I very much like this analogy. Thank you for making this clearer in my mind.