by astrange 972 days ago
An average of the input data would be a scalar value, not a model a few GB in size.

Don't go around making claims about how an instruct-tuned text model works if you've never tried to operate a foundation model. It's very obvious those two aren't doing the same kind of thing; they can't both be "average text".

And of course don't confuse the sampler with the model; samplers are separate from the model and you can swap them out.
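
To make the sampler point concrete, here's a rough sketch, not how any particular inference stack does it. `toy_model`, `VOCAB`, and the sampler names are all made up for illustration; the "model" just returns fixed logits. The only point is that the model's output stays the same while the function that picks a token from it gets swapped.

```python
import math
import random

# Hypothetical four-token vocabulary, purely for illustration.
VOCAB = ["the", "a", "cat", "zebra"]


def toy_model(context):
    """Stand-in for a real model: returns fixed logits over VOCAB."""
    return [2.0, 1.0, 0.5, -1.0]


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities, optionally scaled by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_sampler(logits, rng):
    """Always pick the highest-scoring token; ignores the RNG."""
    return max(range(len(logits)), key=lambda i: logits[i])


def temperature_sampler(temperature):
    """Build a sampler that draws from the temperature-scaled distribution."""
    def sample(logits, rng):
        probs = softmax(logits, temperature)
        return rng.choices(range(len(logits)), weights=probs, k=1)[0]
    return sample


def top_k_sampler(k):
    """Build a sampler that draws only among the k highest-scoring tokens."""
    def sample(logits, rng):
        order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        top = order[:k]
        probs = softmax([logits[i] for i in top])
        return top[rng.choices(range(len(top)), weights=probs, k=1)[0]]
    return sample


def next_token(context, sampler, rng):
    """Same model call every time; only the sampler argument changes."""
    return VOCAB[sampler(toy_model(context), rng)]
```

Calling `next_token("anything", greedy_sampler, random.Random(0))` and `next_token("anything", temperature_sampler(1.5), random.Random(0))` runs the identical `toy_model` both times. Whatever differs in the output comes from the sampler, not the model.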