by goldenkey 1082 days ago
And is RL not? All of these models are constrained by finite weights and then tuned. Are you suggesting we grow a neural network until it meets certain out-of-distribution test criteria? Hmm

What would that buy you? You train until you push your loss under a certain threshold, then check the external criteria, and if they don't hold, you train again? Your external criteria would essentially become another part of your loss function, but the whole training would become vastly more inefficient.
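
To make that loop concrete, here is a rough sketch on a toy problem. Everything in it is an assumption for illustration: the one-parameter model y = w * x, the names train_until and outer_loop, the thresholds, and the choice to tighten the loss threshold by 10x whenever the external check fails. The point is only that the external check ends up acting as the real stopping rule.

```python
def mse(w, data):
    """Mean squared error of the toy model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train_until(w, data, loss_threshold, lr=0.5, max_steps=10_000):
    """Plain gradient descent until the training loss drops under the threshold."""
    steps = 0
    while mse(w, data) >= loss_threshold and steps < max_steps:
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
        steps += 1
    return w, steps


def outer_loop(train_data, ood_data, loss_threshold=1e-2,
               ood_threshold=1e-2, max_rounds=20):
    """Train, check the external (out-of-distribution) criterion, retrain if it fails."""
    w, total_steps, rounds = 0.0, 0, 0
    while rounds < max_rounds:
        w, steps = train_until(w, train_data, loss_threshold)
        total_steps += steps
        rounds += 1
        if mse(w, ood_data) < ood_threshold:
            break  # external criterion holds, stop
        # Hypothetical retraining rule: demand a 10x smaller training loss.
        loss_threshold /= 10
    return w, rounds, total_steps


# Training inputs lie in (0, 1]; the external check uses inputs in [10, 20].
train_data = [(x / 10, 3 * x / 10) for x in range(1, 11)]
ood_data = [(float(x), 3.0 * x) for x in range(10, 21)]

w, rounds, total_steps = outer_loop(train_data, ood_data)
print(f"w={w:.5f} rounds={rounds} steps={total_steps}")
```

In this toy setup the first round already satisfies the training-loss threshold but fails the external check, so the loop keeps going until the external threshold is met. The external check is what decides when training ends, just applied through a slower outer loop.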

I'm saying that we shouldn't expect the models to come up with things we didn't train them to come up with.

Yeah, it wouldn't buy us anything additional. The only way to really verify would be to look at out-of-distribution inputs and see if the outputs satisfy what we'd really like. Wouldn't that be true generalization? Otherwise our model architectures or training data are insufficient.