
I wrote a review of PixelCNN as part of my summer internship. It covers a bit about how careful masking can be used to create a chain of conditional probabilities [0], which AFAIK is exactly how this "causal convolution" works (no dependencies on the 'future'). The PixelCNN and PixelRNN papers also cover this in a fair bit of detail, and Ishaan Gulrajani's code is a great implementation reference for PixelCNN / masking [1].

[0] https://github.com/tensorflow/magenta/blob/master/magenta/re...

[1] https://github.com/igul222/pixel_rnn/blob/master/pixel_rnn.p...
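
For anyone who wants the masking idea in a few lines, here is a rough sketch of the kind of kernel mask PixelCNN uses. It is not taken from either link above: the function name `pixelcnn_mask` is made up for illustration, and it assumes a single channel and raster-scan ordering (so it ignores the R/G/B channel ordering the paper also handles). The mask is multiplied into the conv kernel so each output pixel only sees pixels above it and to its left; type "A" also hides the center pixel, type "B" keeps it.

```python
import numpy as np

def pixelcnn_mask(kernel_size, mask_type="A"):
    """Hypothetical helper: build a (k, k) 0/1 mask for a raster-scan causal conv.

    Assumes a single channel. Multiply the result elementwise into the
    conv kernel weights before applying the convolution.
    """
    if kernel_size % 2 != 1:
        raise ValueError("kernel_size must be odd")
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")

    mask = np.zeros((kernel_size, kernel_size), dtype=np.float32)
    center = kernel_size // 2

    mask[:center, :] = 1.0        # every row above the current pixel
    mask[center, :center] = 1.0   # pixels to the left in the current row
    if mask_type == "B":
        mask[center, center] = 1.0  # type B may also see the current position

    return mask

if __name__ == "__main__":
    print(pixelcnn_mask(3, "A"))
    print(pixelcnn_mask(3, "B"))
```

As I understand the paper, type "A" is used on the first layer (the pixel must not see its own value) and type "B" on later layers (a feature at that position already depends only on earlier pixels), but check the paper for the details.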