by Animats 976 days ago
Well, of course. Large language models reflect the average of the input data.
1 comment

An average of the input data would be a scalar value, not a model a few GB in size.

Don't go around making claims about how an instruct-tuned text model works if you've never tried to operate a foundation model. It's very obvious those two aren't doing the same kind of thing; they can't both be "average text".

And of course don't confuse the sampler with the model; the sampler can be swapped out without touching the model.
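
To make the sampler/model split concrete, here's a rough sketch, not how any particular library does it. `toy_model` is a made-up stand-in that returns fixed logits over a four-token vocabulary (a real model would condition on the context), and the three sampler functions are illustrative names of my own. The only point is that `generate` takes the sampler as a separate argument from the model.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def toy_model(context):
    """Hypothetical stand-in for a language model.

    Returns the same logits for every context; a real model would not.
    """
    return [2.0, 1.0, 0.5, -1.0]


def greedy_sampler(logits, rng):
    """Always pick the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def temperature_sampler(temperature):
    """Build a sampler that draws from the temperature-scaled distribution."""
    def sample(logits, rng):
        probs = softmax(logits, temperature)
        return rng.choices(range(len(logits)), weights=probs, k=1)[0]
    return sample


def top_k_sampler(k):
    """Build a sampler that draws only from the k highest-scoring tokens."""
    def sample(logits, rng):
        top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
        probs = softmax([logits[i] for i in top])
        pick = rng.choices(range(len(top)), weights=probs, k=1)[0]
        return top[pick]
    return sample


def generate(model, sampler, steps, seed=0):
    """Run the same model with whichever sampler is passed in."""
    rng = random.Random(seed)
    context = []
    for _ in range(steps):
        logits = model(context)
        context.append(sampler(logits, rng))
    return context


if __name__ == "__main__":
    # Same model every time; only the sampler changes.
    print(generate(toy_model, greedy_sampler, 5))
    print(generate(toy_model, temperature_sampler(1.5), 5))
    print(generate(toy_model, top_k_sampler(2), 5))
```

The model's output here is a distribution over next tokens, and the text you actually see depends on which sampler you bolt on. Greedy always emits token 0 for this toy model, while the other two spread over more of the vocabulary.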