Hacker News
by llogiq 1082 days ago
Again, there are a lot of words to describe the fact that machine learning is just lossy compression of a bunch of data, with the possibility to interpolate between data points and get somewhat plausible results. This means data points may get lost during compression/training, and certain things will look off, whether it be a preference for banana pairs, even numbers of fingers or certain weasel words in verbiage.
2 comments

And is RL not? All of these models are constrained by finite weights and then tuned. Are you suggesting we grow a neural network until certain criteria are met on out-of-distribution tests? Hmm.
What would that buy you? You train until you push your loss under a certain threshold, then check the external criteria and if they don't hold you train again? Your external criteria would essentially become another part of your loss function, but the whole training would become vastly more inefficient.
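
To make that loop concrete, here is a rough sketch, not anyone's actual training setup. Everything in it is an assumption chosen for illustration: the toy target `x * x`, the linear model, the thresholds, and the helper names `train_until` and `train_with_external_check`. It trains until the in-distribution loss drops under a threshold, then checks an out-of-distribution criterion and restarts from a new seed if the check fails.

```python
import random

def target(x):
    # Hypothetical ground truth, chosen only for illustration.
    return x * x

def mse(w, b, xs):
    # Mean squared error of the linear model w * x + b against the target.
    return sum((w * x + b - target(x)) ** 2 for x in xs) / len(xs)

def train_until(threshold, xs, seed, lr=0.1, max_steps=10_000):
    # Plain gradient descent until the training loss is under the threshold.
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    n = len(xs)
    steps = 0
    while mse(w, b, xs) > threshold and steps < max_steps:
        grad_w = sum(2 * (w * x + b - target(x)) * x for x in xs) / n
        grad_b = sum(2 * (w * x + b - target(x)) for x in xs) / n
        w -= lr * grad_w
        b -= lr * grad_b
        steps += 1
    return w, b, steps

train_xs = [i / 20 for i in range(21)]      # training range: [0, 1]
ood_xs = [2 + i / 20 for i in range(21)]    # external check range: [2, 3]

def train_with_external_check(loss_threshold=0.01, ood_threshold=0.1,
                              max_restarts=5):
    # Outer loop: the external criterion acts as a second, much coarser loss.
    total_steps = 0
    for restart in range(max_restarts):
        w, b, steps = train_until(loss_threshold, train_xs, seed=restart)
        total_steps += steps
        if mse(w, b, ood_xs) <= ood_threshold:
            return True, restart + 1, total_steps
    return False, max_restarts, total_steps

passed, runs, total_steps = train_with_external_check()
print(passed, runs, total_steps)
```

In this toy setup every run gets under the training threshold, yet the external check never passes: a line fitted on [0, 1] cannot track a parabola on [2, 3], so each restart only adds gradient steps without changing the outcome.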

I'm saying that we shouldn't expect the models to come up with things we didn't train them to come up with.

Yeah, it wouldn't buy us anything additional. The only way to really verify would be to look at out-of-distribution inputs and see whether the outputs satisfy what we'd really like. Wouldn't that be true generalization? Otherwise our model architectures or training data are insufficient.
All gradient learning creates kernel machines.