Hacker News
by 9712263 2790 days ago
How could I embed knowledge or an assumption in the neurons of a deep learning model? In a high-level programming language it would be easy, but tweaking the neurons' parameters to embed that knowledge? That sounds more difficult than writing machine code.
4 comments

Any deviation from a series of fully connected layers represents some assumption being made, usually to reduce the size of the parameter space to a subset that is considered more promising.

Convolutions are one example: they assume that proximity correlates with a logical connection.
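To make the "smaller parameter space" point concrete, here is a rough sketch that counts parameters for a fully connected layer versus a convolution on the same input. The image size, channel counts and kernel size are made-up numbers for illustration, not anything from a specific model.

```python
def dense_param_count(n_in, n_out):
    """Weights plus biases for a fully connected layer."""
    return n_in * n_out + n_out


def conv2d_param_count(in_channels, out_channels, kernel_size):
    """Weights plus biases for a 2D convolution.

    The count does not depend on the image size, because the same small
    kernel is reused at every position.
    """
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


# Hypothetical sizes, chosen only for illustration.
height, width, channels = 32, 32, 3
out_channels = 16

# Fully connected: every input value connects to every output value.
dense = dense_param_count(height * width * channels,
                          height * width * out_channels)

# Convolution: each output only looks at a 3x3 neighbourhood.
conv = conv2d_param_count(channels, out_channels, kernel_size=3)

print(f"dense: {dense:,} parameters")
print(f"conv:  {conv:,} parameters")
```

The convolution can only represent functions where nearby pixels interact and the same pattern detector applies everywhere. That restriction is the assumption, and the drop in parameter count is what it buys.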

Note that this is a very useful assumption. Just shuffle the pixels in a photo and try to discern what they show to see how much we rely on that assumption. In fact I'm having trouble coming up with an obvious counterexample[0].

So let's not fall into the trap of these armchair scientists with the big spliff, staring into the distance and intoning trivialities with the air of revelation: "Man.... you're just a slave to your assumptions. What if, like, space and time are one and the same?"

In fact, one could argue that all of AI is an endeavour to find abstract rules defining what's "trivially obvious" to us. You don't have to explain to children that objects in the distance are smaller than when they are close.

Once you succeed with that, it's possible that ML can find a sort of post-modern reality. One that we are blinded to for cultural reasons and the structure of our perception: what if God, for example, appears in the form of seemingly random "pixel errors"? You would easily miss her constant presence due to all the error correction in the pathway of your perception (and also your camera sensors).

But that's the future. Just as art often flourishes within the confines of (often arbitrary) limitations, so do we. And embracing these limitations is done not out of ignorance, but out of expedience.

Depends. Many standard layers express a form of prior knowledge. A CNN layer embeds the assumption of spatial translation invariance (strictly speaking, the convolution itself is translation-equivariant, and pooling adds approximate invariance); an RNN does the same for temporal translation. Graph Neural Nets have permutation invariance. Assumptions can also be expressed as regularisation terms added to the loss function. One common practice is to initialise a net with the weights of another net trained on a related task - usually CNNs trained on ImageNet, and word embeddings for NLP (though lately it is possible to use deep neural nets such as BERT, ELMo, ULMFiT and the OpenAI transformer, pre-trained on large text corpora).
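A small sketch of three of the priors mentioned above, in plain Python: a convolution commuting with shifts, an order-independent readout as a stand-in for what graph nets do, and an L2 penalty added to a loss. All function names and numbers here are invented for illustration, and the convolution wraps around at the edges so that the shift property holds exactly.

```python
def circular_conv1d(signal, kernel):
    """Cross-correlate signal with kernel, wrapping around at the edges."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal, k):
    """Circularly shift signal to the right by k positions."""
    k %= len(signal)
    return signal[-k:] + signal[:-k]


def sum_readout(node_features):
    """Transform each node on its own, then sum: node order cannot matter."""
    return sum(f * f for f in node_features)


def l2_regularised_loss(data_loss, weights, strength):
    """Data loss plus a penalty expressing the prior 'weights should be small'."""
    return data_loss + strength * sum(w * w for w in weights)


x = [1, 4, 2, 7, 0, 3]
kernel = [1, -2, 1]

# Shifting the input and then convolving gives the same result as
# convolving and then shifting the output.
shift_then_conv = circular_conv1d(shift(x, 2), kernel)
conv_then_shift = shift(circular_conv1d(x, kernel), 2)
print(shift_then_conv == conv_then_shift)

# Reordering the nodes leaves the readout unchanged.
print(sum_readout([3, 1, 2]) == sum_readout([1, 2, 3]))

# The penalty grows with the size of the weights.
print(l2_regularised_loss(1.0, [1.0, 2.0], strength=0.5))
```

In each case the knowledge is not written into individual weights by hand. It is built into the structure of the computation or into the objective, and training fills in the rest.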
This is basically the biggest research question the field of ML has: how to express useful prior knowledge, and how to embed it in a network?

The current answer is mostly "network architectures can express priors". That's what matt's convolutional net example is about.

This is clearly insufficient to solve the many still unsolvable-by-deep-learning problems, so the search is still on for more mechanisms.

The point is that even a randomly wired network has an implicit bias. NNs are initialised with random weights, so even an untrained network is biased.
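One way to see this for yourself, as a toy sketch only: build a tiny untrained network with random weights and look at what it outputs. The architecture, seeds and inputs below are arbitrary choices for illustration.

```python
import math
import random


def make_random_mlp(n_in, n_hidden, seed):
    """Return the forward function of an untrained one-hidden-layer network."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]

    def forward(x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        return math.tanh(sum(w * h for w, h in zip(w2, hidden)))

    return forward


# Arbitrary example inputs.
inputs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [-1.0, 0.5]]

net_a = make_random_mlp(2, 8, seed=0)
net_b = make_random_mlp(2, 8, seed=1)

outputs_a = [net_a(x) for x in inputs]
outputs_b = [net_b(x) for x in inputs]

# Neither network has seen any data, yet each already commits to a
# particular function of its input, and the two functions differ.
for x, a, b in zip(inputs, outputs_a, outputs_b):
    print(x, round(a, 3), round(b, 3))
```

Before any training, each network already maps inputs to definite outputs, and which function you get depends on the initialisation and the wiring. Training starts from that function rather than from a blank slate.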