by 6gvONxR4sf7o 2270 days ago
You're talking about whether a generic one-size-fits-all model will do well, but ML is full of bespoke models. It would be simple to build a neural network that can compute (and differentiate through) the max function to within some arbitrary epsilon, even though the most generic model (feed forward network) won't do great.

> the most generic model (feed forward network) won't do great.

See my answer below. For this problem, a generic feed-forward network, even a simple one, will work.

Not just any FFN, but assuming you are using an efficient architecture search, it will probably find one that works.

There are other numerical problems where this doesn't hold, but that's another story.

You demonstrated it for a reeeeeeeallly constrained version of the problem. Do you expect your solution would generalize to many lists? Because it would be easy to make a neural network that does, while your toy example (and larger generalizations) probably won't generalize super well.

x_i = ith list element from list x

y = sum(x_i * softmax(k * x)_i)

This one-parameter, arbitrarily wide network will get arbitrarily close to the max function as k grows.
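
To make the formula concrete, here is a minimal sketch in plain Python. The function name `soft_max_value` and the sample list are made up for illustration, and it assumes k >= 0. It only evaluates the expression above and is not a trained network.

```python
import math

def soft_max_value(xs, k):
    """Return sum_i x_i * softmax(k * x)_i for a list xs and sharpness k."""
    scaled = [k * x for x in xs]
    # Subtract the largest scaled value before exponentiating; this keeps
    # exp() from overflowing and does not change the softmax weights.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return sum(x * e / total for x, e in zip(xs, exps))

xs = [0.3, -1.2, 2.5, 2.4]  # hypothetical input list, any length works
for k in (0.0, 1.0, 10.0, 100.0):
    # k = 0 gives the plain mean; larger k moves the result toward max(xs)
    print(k, soft_max_value(xs, k))
```

The same single parameter k is reused for every element, so the list can be any length. That is what "arbitrarily wide" buys you here.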

This is a super toy version of why attention is so effective. It can pick stuff.
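
A rough sketch of the "picking" idea, assuming plain dot-product scores. The name `attend` and the keys, values, and query are invented for illustration. The max example is the special case where each element's score is its own value.

```python
import math

def attend(query, keys, values, k):
    """Softmax-weighted sum of values, scored by k * dot(query, key)."""
    scores = [k * sum(q * c for q, c in zip(query, key)) for key in keys]
    m = max(scores)  # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return sum(v * e / total for v, e in zip(values, exps))

# Hypothetical toy data: the query lines up with the second key,
# so with a large k the output is essentially the second value.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [10.0, 20.0, 30.0]
print(attend([0.0, 1.0], keys, values, k=50.0))
```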