Hacker News new | ask | show | jobs
by antirez 3219 days ago
An alternative approach that works for every resolution: http://antirez.com/news/113
8 comments

This is a good post, kudos on writing it up so quickly! In fact, the first thing I thought of when I read this post on HN was, "Well why not use a pseudorandom permutation instead of a pseudorandom function, this way we efficiently fill all pixels without first checking if they're red?"

Being that a feistel network is a pseudorandom permutation, this fulfills the need (and in my opinion, even more elegantly than a LFSR). For even better performance you could use AES as the PRF, especially if users have AES-NI instructions available for acceleration. Then use a basic Feistel network for the PRP.

Thanks! Exactly my thought indeed. Probably now that many people are aware of crypto primitives, permutation boxes and other related tools it is an immediate thought to have, but potentially back then when the game was written it was not so obvious.
Absolutely. One of my recent research interests in cryptography has been identifying ways to make pseudorandom permutations faster and using them outside of typical cryptographic contexts. I'm happy to see that you use them in Rax. They map very well to a lot of data structure problems in networking, not just encyption (i.e. assigning IDs to users).
>This is a good post, kudos on writing it up so quickly!

Wait... what? It took him 25 years! ;)

> A good tool to have in a programmer mental box.

Relates well to Shannon's problem solving process' [1] "Step 2: Fill your 'mental matrix' with solutions to similar problems."

[1] http://www.businessinsider.com/engineer-claude-shannon-probl...

Always love seeing applications of Feistel cipher. Used it with AES as the PRF for implementing FPE in legacy systems.

Just want to note that this approach (regardless of PRF) probably wouldn't have worked in 1991. Recomputing the cipher state at every pixel is probably ~10x slower than the single shift + xor in the iterative LFSR approach.

Hello, yes the approach is slower in my implementation, but I've the feeling that a suitable F (much simpler) and a low number of rounds could do the trick. However the highlight in the original post was that the ports were not able to reproduce the effect. Given that the ports are AFAIK successive and use higher resolutions, I bet that the CPU was not an issue in that case.

EDIT: I just randomly checked that 4 rounds of F = ((r * 31) ^ (r >> 3)) & 0xff provide more or less the same result. Multiplying for 31 is the same as shifting 5 bits on the left and subtracting the number again, so it's just 4 rounds of bit shifting and xor.

You should time it :)
No doubt the original code is still faster :-)
You state that a Feistel network has the property that each input value is mapped to a different output value, but how do you ensure that there isn't some cycle whose length is shorter than that of the size of the set of possible inputs? That is to say, what guarantees that every pixel is reached at least once?
Feistel is a permutation. That means it's a 1:1 mapping between a 16-bit # to another 16-bit #.

You run Feistel for each number 0->65535 (corresponding to the "stage" of the FizzleFade) and out comes the pixel to redden at that stage. Since it's a 16-bit number, some values will fall outside of 320x240 resolution, and you ignore those.

Great TLDR!
Hello, please check the chillingeffect comment replies, it is basically the same question. There are no cycles since it's not a generator where the previous number is the seed for the next. It's a transformation which is invertible and guaranteed to be unique by the Feistel network structure itself.
How do you pick and test the parameters in the F() transformation to guarantee that "Every input 16 bit input will generate a different 16 bit output?" your comment says they were picked at random... was that after some iterations? how did you know when you had a proper transformation?

Thank you.

It might help if you look at a diagram such as this one to understand exactly what the code is doing: https://en.m.wikipedia.org/wiki/Feistel_cipher#/media/File%3...

So you're basically using what's known as the Luby-Rackoff construction on a provable pseudorandom function to create a pseudorandom permutation. A pseudorandom function generates output that appears random but which can repeat, which is why it cannot be invertible, and is thus unsuitable as a block cipher (you need to be able to decrypt a ciphertext to a specific plaintext).

A pseudorandom function is used as the round function in the feistel network (in the diagram, that's denoted by the F in the middle). You seed the pseudorandom function with a key K. Because the Feistel network successively transforms L and R in each round (L0, R0, L1, R1 and so on), it can be proven that even when the PRF F generates an output that has already been used, the Feistel structure will transform that output differently than the last time it was used.

In other words, the function F is not itself invertible. Invertibility is provided by the surrounding Feistel structure, because if F was already a permutation you wouldn't need anything else. F is only required to generate pseudorandom output, and the Feistel structure's additional logic is what grants invertibility to it. This is the "magic" of the Luby-Rackoff construction, which allows you to take any PRF and transform it into a PRP.

Hello, you don't have to pick an invertible F(), the way the L and R sides are combined together leads automatically to the network to be invertible. This is the magic of Feistel networks, that F() can be as complex as you want and can be not invertible at all.
If you'd like to see it in action, I made a CodePen to demonstrate antirez's pixel dissolve over a Castle Wolfenstein 3D scene: https://codepen.io/jones1618/pen/YxRVpo
"just scramble the address" was the solution that immediately came to mind. feistelNet looks a little heavy though, is it really necessary? would you not get the same effect from straight xor on the x and y coords?
A permutation like that reduces biases in the output
Couldn't that algorithm equally be used to circumvent the problem of slowing down a file systems the more the available space fills up? I imagine this would be beneficial when storing but detrimental when reading data.