by andbberger 3261 days ago
But at what point do you have to wonder if we're using the wrong basis? And how do you know that augmenting the data with tiny adversarial perturbations won't just leave the network vulnerable in a different direction?

It's pretty obvious how to build translational symmetry into a net that's still expressive and easy to train (convolution). But you have to spoon-feed CNNs rotational and other symmetries by augmenting the training data. What you really want is a model that has all of your data's symmetries built in.
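To make the convolution point concrete, here is a minimal NumPy sketch, not anyone's actual CNN code. It assumes periodic boundaries (so shifts wrap around cleanly) and a random 3x3 filter; `circular_conv2d` is a made-up helper name. It checks that convolving commutes with a shift but not with a 90-degree rotation.

```python
import numpy as np

def circular_conv2d(image, kernel):
    """Circular 2-D convolution via FFT (periodic boundaries are an
    assumption made here so that shifts wrap around cleanly)."""
    padded = np.zeros_like(image)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel  # zero-pad the filter to the image size
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))  # generic filter, no special symmetry

def shift(x):
    """Translate by (2, 3) pixels with wrap-around."""
    return np.roll(x, shift=(2, 3), axis=(0, 1))

# Translation: shift-then-convolve equals convolve-then-shift.
shift_ok = np.allclose(circular_conv2d(shift(image), kernel),
                       shift(circular_conv2d(image, kernel)))

# Rotation: the same check with a 90-degree rotation fails for a generic filter.
rot_ok = np.allclose(circular_conv2d(np.rot90(image), kernel),
                     np.rot90(circular_conv2d(image, kernel)))

print("commutes with translation:", shift_ok)
print("commutes with rotation:", rot_ok)
```

The translation check passes because of how convolution is defined; the rotation check only passes if the filter itself happens to be rotation-symmetric, which is why rotations usually end up in the augmentation pipeline instead.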

My sense is that the community at large regards DL as a magic black box, which it really is not. A complete basis of functions + finite data = guaranteed wonky interpolation between samples. What you really need to do is restrict the class of expressible functions to those you need: build your prior into the model.
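A toy illustration of restricting the function class, under assumptions chosen purely for the example: the target is an even function (cos), the data only covers x > 0, and the "prior" is simply dropping the odd monomials from a polynomial basis. The names `design_matrix` and `fit` are hypothetical helpers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of an even function, seen only on x > 0.
x_train = rng.uniform(0.0, 2.0, size=12)
y_train = np.cos(x_train) + 0.01 * rng.standard_normal(x_train.size)

def design_matrix(x, powers):
    """Columns are x**p for each p in `powers`."""
    return np.stack([x**p for p in powers], axis=1)

def fit(powers):
    """Least-squares fit; returns a predictor closed over the fitted weights."""
    weights, *_ = np.linalg.lstsq(design_matrix(x_train, powers), y_train, rcond=None)
    return lambda x: design_matrix(x, powers) @ weights

predict_full = fit(range(0, 7))     # all monomials up to x**6
predict_even = fit(range(0, 7, 2))  # even powers only: f(-x) == f(x) by construction

x_test = np.linspace(0.1, 2.0, 50)
even_symmetric = np.allclose(predict_even(x_test), predict_even(-x_test))
full_symmetric = np.allclose(predict_full(x_test), predict_full(-x_test))

print("even-basis model symmetric:", even_symmetric)
print("full-basis model symmetric:", full_symmetric)
print("max error on x < 0, full basis:", np.max(np.abs(predict_full(-x_test) - np.cos(x_test))))
print("max error on x < 0, even basis:", np.max(np.abs(predict_even(-x_test) - np.cos(x_test))))
```

The even-basis model cannot violate the symmetry no matter what the data looks like, so whatever it learns on x > 0 carries over to x < 0 for free. The full basis has to guess the odd coefficients from data that cannot constrain them.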

1 comments

This is a huge topic in applying ML in physics and chemistry, where we already have a lot of detailed prior knowledge about the systems we want to describe, and it would be silly not to build it into the ML models.
What's the current state of the art in this direction? Is there a way to encode equations explicitly prior to training?
People now try to use ML anywhere and everywhere, so it's a bit of a Wild West. Three examples: [1] uses a standard neural net to represent a many-body wave function, with all the machinery of quantum mechanics on top of that, and reinforcement learning to find the true ground state. [2] uses a handcrafted neural net, which by construction already takes advantage of a lot of prior knowledge, to directly predict molecular energies. [3] uses a simple kernel ridge regression coupled with a sophisticated handcrafted scheme to automatically construct a good basis (set of features) for a given input, to predict molecular energies.

In all these cases, the ML itself is not the target problem, but only a tool, and most effort goes into figuring out where exactly to use ML as a part of a larger problem, and how to encode prior knowledge, either via feature construction or neural net handcrafting.
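For a rough feel of the "simple regression plus handcrafted features" approach, here is a toy NumPy sketch. It is not the scheme from [3]: the features (sorted pairwise distances), the target (`toy_energy`, a made-up pair potential), and all function names are assumptions for illustration. The point is only that the prior knowledge lives in the features, so the plain kernel ridge regression inherits the invariances automatically.

```python
import numpy as np

def invariant_features(positions):
    """Sorted pairwise distances: unchanged by rotation, translation,
    and relabeling of the atoms."""
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    upper = np.triu_indices(len(positions), k=1)
    return np.sort(dists[upper])

def toy_energy(positions):
    """Hypothetical target, for illustration only: a sum of exp(-r) pair terms."""
    return np.sum(np.exp(-invariant_features(positions)))

def gaussian_kernel(A, B, length_scale=1.0):
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * length_scale**2))

def fit_krr(X, y, ridge=1e-3):
    """Kernel ridge regression: solve (K + ridge * I) alpha = y."""
    K = gaussian_kernel(X, X)
    return np.linalg.solve(K + ridge * np.eye(len(X)), y)

def predict_krr(X_train, alpha, X_new):
    return gaussian_kernel(X_new, X_train) @ alpha

rng = np.random.default_rng(0)
train_configs = rng.uniform(-1.0, 1.0, size=(200, 4, 3))  # 200 toy 4-atom "molecules"
X_train = np.array([invariant_features(c) for c in train_configs])
y_train = np.array([toy_energy(c) for c in train_configs])
alpha = fit_krr(X_train, y_train)

# Rotate, translate, and relabel a new configuration; the prediction should not move.
config = rng.uniform(-1.0, 1.0, size=(4, 3))
theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta),  np.cos(theta), 0.0],
                     [0.0,            0.0,           1.0]])
moved = config[::-1] @ rotation.T + np.array([0.3, -0.2, 0.5])

e_original = predict_krr(X_train, alpha, invariant_features(config)[None, :])[0]
e_moved = predict_krr(X_train, alpha, invariant_features(moved)[None, :])[0]
print(e_original, e_moved)
```

The two printed predictions agree up to floating-point noise, and the regression never had to see a rotated training sample to get there.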

[1] http://sci-hub.io/10.1126/science.aag2302

[2] http://sci-hub.io/10.1038/ncomms13890

[3] https://arxiv.org/pdf/1707.04146.pdf

Variational inference