FizzleFade | HN Mirror

An alternative approach that works for every resolution: http://antirez.com/news/113

dsacco 3219 days ago

This is a good post, kudos on writing it up so quickly! In fact, the first thing I thought of when I read this post on HN was, "Well why not use a pseudorandom permutation instead of a pseudorandom function, this way we efficiently fill all pixels without first checking if they're red?"

Being that a feistel network is a pseudorandom permutation, this fulfills the need (and in my opinion, even more elegantly than a LFSR). For even better performance you could use AES as the PRF, especially if users have AES-NI instructions available for acceleration. Then use a basic Feistel network for the PRP.

Thanks! Exactly my thought indeed. Probably now that many people are aware of crypto primitives, permutation boxes and other related tools it is an immediate thought to have, but potentially back then when the game was written it was not so obvious.

dsacco 3219 days ago

Absolutely. One of my recent research interests in cryptography has been identifying ways to make pseudorandom permutations faster and using them outside of typical cryptographic contexts. I'm happy to see that you use them in Rax. They map very well to a lot of data structure problems in networking, not just encyption (i.e. assigning IDs to users).

0x4a42 3206 days ago

>This is a good post, kudos on writing it up so quickly!

Wait... what? It took him 25 years! ;)

[1] http://www.businessinsider.com/engineer-claude-shannon-probl...

wyldfire 3219 days ago

> A good tool to have in a programmer mental box.

Relates well to Shannon's problem solving process' [1] "Step 2: Fill your 'mental matrix' with solutions to similar problems."

first_amendment 3219 days ago

Always love seeing applications of Feistel cipher. Used it with AES as the PRF for implementing FPE in legacy systems.

Just want to note that this approach (regardless of PRF) probably wouldn't have worked in 1991. Recomputing the cipher state at every pixel is probably ~10x slower than the single shift + xor in the iterative LFSR approach.

Hello, yes the approach is slower in my implementation, but I've the feeling that a suitable F (much simpler) and a low number of rounds could do the trick. However the highlight in the original post was that the ports were not able to reproduce the effect. Given that the ports are AFAIK successive and use higher resolutions, I bet that the CPU was not an issue in that case.

EDIT: I just randomly checked that 4 rounds of F = ((r * 31) ^ (r >> 3)) & 0xff provide more or less the same result. Multiplying for 31 is the same as shifting 5 bits on the left and subtracting the number again, so it's just 4 rounds of bit shifting and xor.

first_amendment 3219 days ago

You should time it :)

No doubt the original code is still faster :-)

Orangeair 3219 days ago

You state that a Feistel network has the property that each input value is mapped to a different output value, but how do you ensure that there isn't some cycle whose length is shorter than that of the size of the set of possible inputs? That is to say, what guarantees that every pixel is reached at least once?

first_amendment 3219 days ago

Feistel is a permutation. That means it's a 1:1 mapping between a 16-bit # to another 16-bit #.

You run Feistel for each number 0->65535 (corresponding to the "stage" of the FizzleFade) and out comes the pixel to redden at that stage. Since it's a 16-bit number, some values will fall outside of 320x240 resolution, and you ignore those.

Great TLDR!

Hello, please check the chillingeffect comment replies, it is basically the same question. There are no cycles since it's not a generator where the previous number is the seed for the next. It's a transformation which is invertible and guaranteed to be unique by the Feistel network structure itself.

chillingeffect 3219 days ago

How do you pick and test the parameters in the F() transformation to guarantee that "Every input 16 bit input will generate a different 16 bit output?" your comment says they were picked at random... was that after some iterations? how did you know when you had a proper transformation?

Thank you.

dsacco 3219 days ago

It might help if you look at a diagram such as this one to understand exactly what the code is doing: https://en.m.wikipedia.org/wiki/Feistel_cipher#/media/File%3...

So you're basically using what's known as the Luby-Rackoff construction on a provable pseudorandom function to create a pseudorandom permutation. A pseudorandom function generates output that appears random but which can repeat, which is why it cannot be invertible, and is thus unsuitable as a block cipher (you need to be able to decrypt a ciphertext to a specific plaintext).

A pseudorandom function is used as the round function in the feistel network (in the diagram, that's denoted by the F in the middle). You seed the pseudorandom function with a key K. Because the Feistel network successively transforms L and R in each round (L0, R0, L1, R1 and so on), it can be proven that even when the PRF F generates an output that has already been used, the Feistel structure will transform that output differently than the last time it was used.

In other words, the function F is not itself invertible. Invertibility is provided by the surrounding Feistel structure, because if F was already a permutation you wouldn't need anything else. F is only required to generate pseudorandom output, and the Feistel structure's additional logic is what grants invertibility to it. This is the "magic" of the Luby-Rackoff construction, which allows you to take any PRF and transform it into a PRP.

Hello, you don't have to pick an invertible F(), the way the L and R sides are combined together leads automatically to the network to be invertible. This is the magic of Feistel networks, that F() can be as complex as you want and can be not invertible at all.

jones1618 3219 days ago

If you'd like to see it in action, I made a CodePen to demonstrate antirez's pixel dissolve over a Castle Wolfenstein 3D scene: https://codepen.io/jones1618/pen/YxRVpo

gfody 3219 days ago

"just scramble the address" was the solution that immediately came to mind. feistelNet looks a little heavy though, is it really necessary? would you not get the same effect from straight xor on the x and y coords?

Natanael_L 3218 days ago

A permutation like that reduces biases in the output

jhoechtl 3218 days ago

Couldn't that algorithm equally be used to circumvent the problem of slowing down a file systems the more the available space fills up? I imagine this would be beneficial when storing but detrimental when reading data.

buzzybee 3219 days ago

FizzleFade is also found in Microprose games from the era (e.g. Railroad Tycoon, Civilization), sometimes in full-screen transitions and other times to fade in single sprites. But more relevantly to "id software history", you can find it in Origin's Space Rogue, which John Romero contributed to. A likely possibility is that he picked up the trick on this or a previous project while at Origin.

It's also possible to use a slower "arbitrary PRNG and bump" scheme that tests the VRAM for the desired values(e.g. if it were a sprite, by running the blit at that pixel address and testing) and walks forwards or backwards until an unset value is found. If the walk can be done fast enough, it'll execute at the same framerate as an LFSR cycle-length fade. It can be further supplemented with an estimation heuristic or a low-resolution map to generate unique patterns. It's just less speedy and mathematically interesting to do that.

ticklemyelmo 3219 days ago

I remember seeing it in a variety of mid-80's Commodore 64 games, where it ran at full speed and looked fantastic. I always wondered how it worked.

zimpenfish 3219 days ago

I'm 95% sure I've seen this effect on a Spectrum which would likely predate even Space Rogue - I'd guess that would be an LSFR fade because "arbitrary PRNG and bump" would be hella clunky given the Spectrum's screen layout.

0. http://jm.planetarydefenses.net/sense/refs/ref14_golomb.pdf

hyperion2010 3219 days ago

If want to know more about cool things you can do with shift registers and you've never heard of Solomon W. Golomb, check out Shift Register Sequences (intro at [0]). Most of our fundamental telecommunications is possible because he solved the mathematics involved.

simias 3219 days ago

The original GameBoy had a hardware LFSR that could be used to generate white-noise-like sounds. It was often used for "whooshing" effects and also cymbal sounds, such as in the famous Super Mario Land theme: https://www.youtube.com/watch?v=Gb33Qnbw520

WillKirkby 3219 days ago

For more technical details on the exact LFSR used, there's this: http://belogic.com/gba/channel4.shtml

khedoros1 3219 days ago

Ditto with the NES. It actually had two possible tap configurations with different sequence lengths, so that they have slightly different sounds to them.

dmichulke 3219 days ago

A pseudo-RNG that cycles through a all elements of a modulo-ring.

Example for a 2^32 bit cycle:

X(n+1) = (a * X(n) + c) mod m

a = 134775813

c = 1

m = 2^32

leni536 3219 days ago

My approach would be something like this, but with a very "poor" generator with the parameters a=81007, c=0 and m=2^17. This approximates a low discrepancy sequence (additive recurrence with alpha=1/golden ratio). Then I would calculate x and y values using the hilbert curve and the calculated pseudorandom number as the index (more precisely two Hilbert curves next to each other, so it covers a 512x256 rectangle). On today's CPUs it can be calculated quite fast (shameless selfplug: https://github.com/leni536/fast_hilbert_curve). I suspect that the resulting pattern on the screen would be less random looking, but more uniform without any obvious pattern.

schindlabua 3219 days ago

Cool! I was just looking up hilbert implementations yesterday, so that's super useful. Thanks!

(quick note: in your source the function is called hilebert instead of hilbert)

leni536 3219 days ago

Fixed, I also added a Wolfram Mathematica module that can be used with LibraryFunctionLoad.

https://stackoverflow.com/questions/464476/generating-shuffl...

barrkel 3219 days ago

Indeed; and careful selection of parameters for the LCG can truncate the ring to most arbitrary powers of two. And if you're willing to live with slight inefficiency (no more than twice as much work), an arbitrary modulo ring (shuffled sequence) can be produced by creating slightly larger range and skipping values that are outside the range.

This is a question I asked on SO some years ago relating to this problem - producing a shuffled range of numbers without allocating an array:

xigency 3219 days ago

I thought of this too, but it might be a bit more taxing on an old CPU to do excess multiplication than a simple XOR and test.

wott 3219 days ago

> asm mov ax ,[ WORD PTR rndval ]

> asm mov dx ,[ WORD PTR rndval +2]

> asm mov bx , ax

> asm dec bl

> asm mov [ BYTE PTR y ], bl // low 8 bits - 1 = y

> asm mov bx , ax

> asm mov cx , dx

> asm mov [ BYTE PTR x ], ah // next 9 bits = x

> asm mov [ BYTE PTR x +1] , dl

I don't understand the need for the second asm mov bx , ax : BX is not used afterwards. Same for CX, it is never used.

> uint32_t rndval = 1;

> uint16_t x,y;

> do

> {

> y = rndval & 0x00F; // Y = low 8 bits

> x = rndval & 0x1F0; // X = High 9 bits

Er... no, if you do that, you only get the lowest 4 bits in y, and then you only get 5 bits in x (and not the right ones, of course).

It should be:

       y = rndval & 0x000000FF;  // Y = low 8 bits

And then you have a 'problem' for x, because you must shift it right, otherwise it doesn't fit in a 16-bit variable:

       x = rndval & 0x0001FF00;  // X = bits 8 to... 16 > 15, irk

So you should just do :

       x = rndval >> 8;  // X = bits 8 to 17, in their right place

nik_0_0 3218 days ago

Looks like the author fixed the C translation following your comments:

  y =  rndval & 0x000FF;  /* Y = low 8 bits */
  x = (rndval & 0x1FF00) >> 8;  /* X = High 9 bits */

However, given that the assembly is verbatim from id-software's git, I guess those extra instructions are part of history now.

DecoPerson 3219 days ago

I used LFSR to render the red grading in this little experiment: https://youtu.be/fUpUrpHLUxo .

I have a feeling the Octane rendering engine uses it too.

It's a good fit for most cases where you want random sampling of a set without replacement.

leni536 3219 days ago

> Since 320x200=64000, it could have been implemented with a 16 bits Maximum-length register.

But then you have to calculate modulus for 200 or 320.

tripzilch 3218 days ago

except the pixels are in a linear framebuffer, the fizzlePixel(x, y) function has a multiplication by 320 in it to calculate the address in the buffer.

so you could skip both the modulo and the mul if you go straight for obtaining a random shuffled framebuffer index.

unless maybe fizzlePixel does additional bookkeeping for some purpose.

https://github.com/id-Software/wolf3d/blob/05167784ef009d0d0...

leni536 3218 days ago

> has a multiplication by 320

Huh the original implementation seems to have a lookup table instead of that, interesting.

steventhedev 3220 days ago

It never ceases to amaze me how many of the older games were implemented as circuits first, and then translated to code. Makes you really appreciate how far we've come, and what sort of background that generation of developers had.

dre85 3219 days ago

I didn't quite understand why this is guaranteed to reach every pixel coordinate? Is there something inherent about LFSR that generates complete sequences within the cycle? So elements are never repeated or omitted?

fpgaminer 3219 days ago

Yeah this wasn't covered well in the article.

Go back to the article and look at the section just below the first mention of Maximum-Length LFSR.

Take a look at that list of numbers and notice something; every number from 1-15 is output once and only once.

That's a property of Maximum-Length LFSRs; they output each number in their range once and only once.

So, for example, a 17-bit Maximum-Length LFSR will output every number from 1-131071, just in random order.

The Wolfenstein code separates the output of the LFSR into X and Y coordinates. Since the LFSR will visit every possible number exactly once, it will visit every possible combination of X and Y coordinates exactly once.

You can look at the 4-bit Maximum-Length LFSR again. Split each number it outputs into 2-bit X and Y:

You'll see that it hits every point on a 4x4 screen exactly once, in random order.

The caveat is that it doesn't seem to hit 0,0. This is because an LFSR can't go to 0, otherwise it gets stuck there. However, I believe the ASM code was incorrectly translated by the article author. For example the author seemed to forget to translate the "dec bl" instruction into the C equivalent, which would subtract 1 from the y coordinate and allow visiting 0,0.

dre85 3219 days ago

Thanks for that explanation! I just wanted to confirm that it's an inherent property of the LFSR.

andars 3219 days ago

The cycle length is 2^17-1 = 131071. The number of pixels is 320*200 = 64000. Therefore each pixel is hit at least once.

The authors likely used 17 bits instead of 16 because then the x and y coordinates can be obtained via masking rather than modulo (8 bits -> 256 values < 320 pixels).

jstapels 3219 days ago

I was wondering the same thing myself. I'm guessing they tested it and found that it hit every coordinate at least once and said "good enough". Meaning, I don't think it's guaranteed, they just picked one that did.

The interesting thing is that you're guaranteed to always have a different number because otherwise the cycle would restart. So you just need to find a sequence that's long enough.

pasta 3219 days ago

I have the feeling that knowledge about bits is lacking by a lot of younger coders. And I also think this is what causes bloatware.

CPUs are powerful enough to use a naive fade transition. But coders who are aware of the internal workings can make it even faster on todays hardware.

Great article and imho still relevant on todays much more powerful computers.

Why? I mean if you look at the majority of work that programmers do today - frontend/backend web development and apps, there is no need to have knowledge about bits.

In fact, if I see someone using binary operators in languages such as Java,JS,Ruby etc... I'll immediately consider it bad code, regardless of context - it's just not the right tool for the level of abstraction in these languages.

The fact is that in these products (frontend/backend web development and apps) the performance profile is dominated by bad algorithms, wrong data structures, slow libraries, missing db indexes etc... which knowledge about bits help absolutely zero with.

Large binaries are also not dominated by code you write but rather by using too broad libraries or simply from huge assets.

The only thing I'd consider knowledge of bits to be of any help to a run of the mill developer these days is the knowledge that floating point numbers don't multiply/divide well - but that kind of knowledge can be imparted without really diving into how bits work.

ajuc 3219 days ago

> In fact, if I see someone using binary operators in languages such as Java,JS,Ruby etc... I'll immediately consider it bad code, regardless of context - it's just not the right tool for the level of abstraction in these languages.

Clojure is written in java (and clojure). It uses bit operations to implement Software Transactional Memory and persistent collections. Is it a bad code?

You seem to think a programmer should always code on the same level of abstraction. I'd argue in many applications often you find no suitable abstraction, and have to implement it yourself.

Also linear feedback shift registers can be used without any knowledge about bits other than the fact that integers loop over when they reach maximum value (and every programmer should know that anyways). And these registers are useful for some very nice algorithms (like in-place pseudorandom permutations' generators - for example for shuffling songs in a media player, or for some ai algorithms).

Can_Not 3219 days ago

> It uses bit operations to implement Software Transactional Memory and persistent collections. Is it a bad code?

I don't know anything about clojure, but it sounds like you're clobbering in the suggestion that because bitwise operations are bad in high level business code, they must also be bad in low level language implemention code. At best you're not paying attention, at worst that's an intellectually dishonest counter argument.

I don't know enough about Clojure, but I think you are saying that Clojure itself is using bit operations behind the hood - of course it's ok for a language implementation to use bit operations.

But if any of your Java code for applications or libraries used bit operations, I'll call it bad code.

ajuc 3219 days ago

But clojure is a java application, and its libraries (written in java) are also java libraries, you can use them from java (and some people do).

There's nothing wrong with writing your own library implemented using bit operations.

open_bear 3219 days ago

Conway's Game of Life can be implemented naively in C/C++/Java/etc, using a boolean for every cell's state (on/off). This will require at least n*m bytes (probably more in Java). Using bits to store that data will require 8 times less, which will most likely greatly increase the performance because of the data locality and the amount of data that will fit into a CPU cache.

vardump 3218 days ago

You could probably have better savings by taking advantage of the fact 99%+ of game of life state is static between iterations.

Maybe quadtree division of space etc. Only touch nodes that had no short period objects, like honeycombs, boxes or blinkers.

You wouldn't even need storage for completely empty nodes.

I'm sure best/fastest GoL engines have even more interesting strategies.

Conway's game of life is at best school homework, it does not correlate to any programming done in a real job.

tentaTherapist 3214 days ago

But if a standard library implementation is allowed to do something why can't I?

bitexploder 3219 days ago

This just seems wrong to me. There are plenty areas of programming where knowing a little bit about internal data representation, either in memory or in your data store can help you make better choices. Not everyone needs to be a system programmer, but folks still need general computer awareness. We just aren't far enough from the CPU yet. Sufficiently advanced understanding of algorithms implies computer architecture awareness to me.

I get what you are writing about here, that higher level concepts dominate programming many common apps today, but it seems like a weird place to allow yourself to have a knowledge gap.

deanCommie 3219 days ago

It shouldn't feel wrong to you. In today's connected big-data world, the vast majority of your performance latency is from querying your data (if you structure your db incorrectly), followed closely by client to server communication.

That's the p90 use case for development people are doing.

bitexploder 3219 days ago

So many simple things are only easy if you know bits. Interpreting a packet capture, poking at memory, designing cache friendly data structures, ... It is like second nature to me. If it isn't required (obviously it isn't, it must at least be a strong competitive advantage). I am not that old. I don't have get off my lawn moments. I grew up (really learned at least) on a processor (P133) where bits started mattering less, but they still mattered.

Never needed to interpret a packet, I intercept communications at the 7th layer to troubleshoot things - namely http connections.

I don't poke at memory, I use a profiler that tells me what every piece of code and variables I use take in memory.

I don't design cache friendly data structures, I use redis.

I would wager that my experience is closer to the development situation and needs of the majority of programmers.

I don't see why knowledge of bit operations are necessary to do a good programming job.

pennaMan 3219 days ago

What are some good resources to learn about bits?

lorenzk 3219 days ago

What's a "p90 use case"?

bitexploder 3219 days ago

90th percentile use case

theon144 3219 days ago

>the majority of work that programmers do today - frontend/backend web development and apps

Therein lies your bias, and the rest of your comment just follows.

I'm definitely bias towards frontend/backend web development and apps (I.E. mobile/desktop apps).

But that's not to say the premise is bad - I don't think it's wrong to assume the majority of work is in that area, definitely for young developers. There aren't that many real-time and performance-demanding programming jobs compared to web development.

I don't think even game-programming uses bit operations heavily these days- most of the low-level work is done by third party engines.

kutkloon7 3219 days ago

I remember a recent example. What I wrote was:

    int x = somefunction();
    int x_dividedby16 = x >> 4;

My coworker corrected the second line to something like:

    int x_dividedby16 = (int)Math.ceil(x / 16.0);

seanwilson 3219 days ago

I'd only ever use a bitshift if the intention of the code was to move the bits left or right (e.g. moving red, green or blue pixel component bits into the right position) or if it was absolutely required for speed as it's less readable if your intention is to divide a number. C compilers will easily optimise simple integer divisions like this for example.

biesnecker 3219 days ago

Did your coworker correct it with "this is more readable" or "this is the correct way to do it"? The first is arguable (though I don't personally agree with the argument), the second just betrays a real lack of knowledge.

sammoth 3219 days ago

Don't the two bits of code have completely different behaviour? And the code with the bitshift is undefined on negative integers (in C). So the bottom code could indeed be the "correct way to do it".

biesnecker 3219 days ago

Fair, with negative ints they're different. They have the same behavior on positive ints, though.

krzat 3219 days ago

Why not `x / 16`?

sheldor 3219 days ago

Integer division will give a rounded result instead of floor

sp332 3219 days ago

I think they're going for the ceiling here, not the floor. Truncation of positive values should be the same as a floor function.

jskkk 3219 days ago

Not in C..?

thechao 3219 days ago

   (x + 15) / 16

jskkk 3219 days ago

Yes, which is more readable and would be compiled to the same thing

throwaway2016a 3219 days ago

Just the other day I showed one of our junior devs how they could turn their 10 lines of somewhat slow and obfuscated code into a 3 arguably more readable lines using a bitwise operations. And this was a front-end webapp.

Then again, more readable is subjective and some devs might see "<<" and get thrown off.

onion2k 3219 days ago

A reasonable 'compromise' of 3 lines of code with 7 lines of comment to explain what it's doing would be my preference.

throwaway2016a 3219 days ago

Or... and just throwing this out there... developers can take responsibility to know what bitwise operators do the same as they know what "+" does.

The three lines of code were:

for ( ... ) { x = x << y }

The developer was essential using an array of booleans to simulate a bitwise operator.

It is very sad that devs these days need a comment to know what that does. Would we expect 7 lines of comments if it was:

for ( ... ) { x = x + y; }

onion2k 3219 days ago

developers can take responsibility to know what bitwise operators do the same as they know what "+" does

Yes but they won't. Pragmatically speaking adding a comment to explain a concept the next developer won't use very often means you won't need to answer their questions about it when they next need to modify that piece of code. That's a big win for you, and it means the other developer can carry on being productive.

viraptor 3219 days ago

> using binary operators [...] bad code, regardless of context

> the performance profile is dominated by bad algorithms, wrong data structures

Do you know how you often implement good algorithms and data structures? You wouldn't implement a bloom filter as a high level array of 1/0 integers, would you?

I would definitely implement a bloom filter with an array of booleans, and will not use bit operations to modify the cells.

Whether I use array access or a bitmask will mean nothing performance-wise, since the bloom filter itself is likely already speeds up a different algorithm by an order of magnitude.

For most operational purposes there isn't a crying need for optimizations and handwritten assembly outside the kernel but a good programmer must at least recognize where something like this is called for and be able to implement (imo).

laydn 3219 days ago

I agree. Unfortunately, it feels like there is so many people out there lacking the fundamental engineering background to write software.

I once spent an hour "optimizing" a piece of code that ran in 15 minutes, which tested an embedded system. The resulting code ran in 4 seconds. The root cause was a complete misunderstanding of how microprocessor interrupts work.

I spent a week re-writing a 4000 line C-function, which ended up to be around 50 lines of code, doing the same thing. The root cause was a complete misunderstanding of how state machines should be constructed.

These types of rookie mistakes by itself don't worry me. But the "trend" seems to be that young coders graduate with less knowledge about the underlying hardware on which their software will run.

I feel like just like Mathematics and Physics, low level programming / microprocessors / computer architecture classes should be part of many university curriculum. (Oh and I think they should teach about version control systems in college as well, but that's the topic of another story!)

LeifCarrotson 3219 days ago

> lacking the fundamental engineering background to write software.... young coders graduate with less knowledge about the underlying hardware on which their software will run.

There's a lot of software being written in Excel macros and Wordpress web development. These tools actually get work done - they automate things that were not automated before, they publish things that were not published before, and they generate revenue and improve productivity.

Young coders, and the authors of these programs, should not be forced or expected to go through low-level programming classes to get work done. The tools are good enough and the hardware is fast enough that for 99.99% of all projects, they don't need this expertise.

I say this as someone with an EE degree (from less than a decade ago) consisting of classes that spanned the entire pyramid of complexity: from analysis and simulation of individual transistors, through construction of a CPU on an FPGA, through writing an RTOS for a microcontroller, up to programming a web server. That understanding benefits me greatly in what I'm doing today.

But I work with plenty of people who don't have that background. For all the business logic that seems to be found in any software project, that's just fine. Most of the work is in translating people's desire for the software to "do what I want" into actual requirements and then codifying those requirements into a database or UI. For construction of state machines, they should use a library or have someone more competent construct the framework and let them implement the actual states, and when the work requires an understanding of how microprocessor interrupts, they should defer to someone who understands that technology.

And yes, they should teach some version control in college as well. It would be enough to give students one major project that ends up with the filename laydn-Project-complete-final-actuallyfinal-3-done-8-29.zip, which just worked a minute ago after which you barely changed anything and now it's completely broken, and that thing that worked once stopped working at some unknown time in the past. Then just mention that git and svn are things that exist.

nur0n 3219 days ago

I agree with you for the most part. Writing poor code is fine if the code will only be run for a limited use case. But much of the power of software comes from its potential to be easily reusable and extendable. This potential is thrown away when people write poor code. At some point, it becomes detrimental to keep reusing poor code and there is no choice but to rewrite it. So writing poor code is productive work in some sense, but it is useless(almost always, in my experience) in constructing higher levels of abstraction.

The complete disregard for good design I have seen troubles me as a software engineer. When I've had to rewrite code, usually its not because I disagree with its design, but because there is no design to disagree with! IMO code design should be taught better (or at least taught at all...). Only then can someone make the judgement call of what parts can be poorly written and which parts require more care.

This next claim is anecdotal: the people which don't bother to learn even the basics of how actual machines work tend to be the same people which don't bother to properly design their code. Good design necessitates some understanding of how the machine works, it can not be accomplished through brute force pattern matching. I understand that code is becoming increasingly common and that some people have no choice but to interact with it. But if a significant portion of your income is generated through work in software, then I don't really think a valid excuse exists; you should do your job correctly. I'm not saying everyone needs to be an expert. The culture of apathy for the most basic computing principles is what troubles me, and it should trouble anyone who has to read code written by others.

Retr0spectrum 3219 days ago

I was taught about low-level architecture (and version control too) in the first year of my CS course. Is this not standard?

lifthrasiir 3219 days ago

You should have said "thinking outside of the abstraction" instead. Bits are implementation details. The very knowledge of linear-feedback shift register is essential in this particular example.

> CPUs are powerful enough to use a naive fade transition. But coders who are aware of the internal workings can make it even faster on todays hardware.

Not really. No one would do the fade transition---or more accurately, the regional alpha blending with a fixed color---in CPU nowadays exactly because GPUs are much more capable in that job, and such a job is actually fairly easy in most abstractions used because it is an integral part of UX and whatever. The ability of leveraging all resources and determining their efficient utilization is a key.

peteretep 3219 days ago

    > I have the feeling that knowledge
    > about bits is lacking by a lot of
    > younger coders. And I also think
    > this is what causes bloatware

From a management perspective, I wonder if people find older developers miss obvious solutions that involve throwing small amounts of money and/or hardware at business problems, and instead turn to "clever" solutions that are costly in terms of extra developer time needed for developer and maintenance.

Certainly when running technology in an SME I found my gut feelings about many cost/benefit questions were often invalidated by the ever-decreasing cost of computing power, both in terms of physical hardware and cloud resources.

notzorbo3 3219 days ago

Managers often think throwing more hardware at the problem will fix things. But hardware scales sub-linearly. Solving the actual underlying problems can easily get you exponential payoffs.

Anecdote: One of my managers once spent €70000¹ worth of hardware to speed up an application because he believed it would be cheaper than to optimize the code. Naturally, no-one had done any kind of performance analysis, so while the upgrade made things about 20% faster, it still wasn't enough. It took me all of 20 minutes to figure out that caching was disabled on the database and it had no indexes whatsoever². 30 minutes of work yielded a few thousand percent speed increase.

¹) This was a long time ago before SSDs existed and buying a hardware RAID of fast disks and fast CPUs / memory was extremely expensive.

²) DBAer consultants of a certain large database vendor are not worth the horrendous hourly fee they ask.

peteretep 3219 days ago

    > Managers often think throwing more
    > hardware at the problem will fix things

It often will...

    > no-one had done any kind of performance
    > analysis

... is the moral of this story

nur0n 3219 days ago

I think we agree on the basic point: you can't decide what the most cost effective solution is until analysis is carried out. But an effective performance analysis can't be executed without an understanding of the machine.

Lacking low level knowledge can only create blind spots in the analysis.

chillingeffect 3219 days ago

Nice work! This is where legends of 10x programmers come from.... they're 2x programmers in a world of 0.2x programmers ;)

laydn 3219 days ago

That almost physically hurts to read!

The performance modelling teams working for the large CPU makers get several code traces from major database vendors to optimize the execution of critical pieces of database code. They do a cycle-by-cycle analysis.

And then people run the databases with caching and indexing disabled. Go figure

xorcist 3219 days ago

> I wonder if people find older developers miss obvious solutions that involve throwing small amounts of money and/or hardware at business problems,

I regularly see the exact opposite: People throwing money and hardware at problems instead of doing the simple and obvious thing. I guess that makes me officially old.

peteretep 3219 days ago

    > People throwing money and hardware at
    > problems instead of doing the simple
    > and obvious thing

There are (almost) no technical problems, just business decisions. Programmer time for development and maintenance is expensive. Hardware is decreasingly so (also it's CapEx, so nobody cares if you literally light it on fire).

wmccullough 3219 days ago

I find this too. Even the so-called proprietary algorithms many companies tout aren't even that complicated because 90% of their business runs on a REST over CRUD in SQL web service with a slightly different bootstrap theme on the frontend.

xorcist 3217 days ago

See, now you fell for the same fallancy. Something that is smart must surely be more expensive, but this is seldom true. A method that is simple is easier to describe, easier to implement and could very well be more performant than a more complicated one.

eru 3219 days ago

You can often see a bit bump in complexity when going from single computer to some kind of cluster environment (be that hadoop or similar). That complexity also often decreases efficiency.

So it can be useful to think a bit to be able to stay on one machine.

But making those trade-offs is exactly what engineering is about, and what's hard. So talking in the very abstract is seldom useful.

It depends on the person, obviously. But, I hope you sprinkle in a few of those types of analysis questions in your standardized interviews so that older people like me even have a chance.

It hurts (both emotionally and financially) to be rejected by younger teams because they assume things about me that aren't true. I now get the dreaded "has too much experience and would be bored" rejection line all too often.

There is only one way to go there. Start a consultancy and be your own boss. That is about it for us older folks. Most of your customers don't want stellar code: they want you to fix what is broken (which is simple enough).

seanwilson 3219 days ago

> From a management perspective, I wonder if people find older developers miss obvious solutions that involve throwing small amounts of money and/or hardware at business problems, and instead turn to "clever" solutions that are costly in terms of extra developer time needed for developer and maintenance.

Why do people feel the need to generalise about things developers of different ages do...?

Knowing when to save on developer time by spending money in some form definitely seems to be a learned skill though.

peteretep 3217 days ago

    > Why do people feel the need to
    > generalise about things developers
    > of different ages do

Mostly it was a tongue-in-cheek reply to a comment doing the same...

mighty_bander 3219 days ago

You'll find a lot of people of all ages that get lost in the problem instead of considering the cost/benefit and going with a good compromise between cheap and "perfect."

Of course you'll also find a lot of kids who overestimate the maintenance cost of 3 lines of weird code sitting in a corner doing their thing for 20 years ;)

This is where some knowledge as a syseng or SA comes in very handy. Programmers tend to come up with cool solutions to a problem in their wheelhouse but it may not be the most cost effective or efficient solution. Having domain knowledge is very important.

dakom 3219 days ago

Anecdotal evidence - I cut my teeth on C and a dabble of asm as a kid (even writing z80 opcodes directly for fun!). I vaguely remember wrangling bits to drive logic. I also wrote webpages as Apache C modules with binary logic to parse requests :D

That was a long time ago in tech terms (~15 years). In the intervening gap I spent most of my time in scripted languages (as3, c#, etc.).

Result? I don't know how to wrangle bits anymore. The added speed increase I'd add almost definitely isn't worth it (for example, flash and unity games, web services, etc.). That doesn't mean it isn't worth it for the end product - I rely on Unity, Adobe, Google, etc. to worry about those optimizations so I can stand on their shoulders and benefit.

But in terms of cost/benefit, getting into that stuff myself (again) just didn't make sense for most of my professional needs.

ksk 3219 days ago

>I rely on Unity, Adobe, Google, etc. to worry about those optimizations so I can stand on their shoulders and benefit.

It seems to me like they might be doing the same thing! For all the supposedly smart people they hire, TBH I can't think of a single product that feels fast/performant to me.

dakom 3218 days ago

Ultimately that's a subjective call. If I had to write the Unity Editor and enable it to hit the targets they do - man I don't even know where to start. Seems like voodoo to me, personally I'm extremely impressed but I admit it might just be my lack of knowledge of what's possible now (last time I did low-level graphics stuff was amateur mucking around with mode 13h and SDL)

Recently, I started helping out data engineering teams improve performance of their big data processing pipelines.

Man, was I shocked. Very smart, highly educated, mid-level and even senior software engineers seem to know very little about bits these days. When they'd run into a memory issue, their natural response was to just spin up a few more servers and throw another terabyte of memory at the problem. Makes sense, I guess - until their CFO saw their pretty exponential curve in infrastructure costs.

hasenj 3219 days ago

It's lucrative for SaaS providers to convince developers that they don't need to care about performance. Just throw more memory at your problem. We'll be there to provide you all the hardware you need. wink wink.

And did knowledge of bits help solve the memory issue?

MichaelGG 3219 days ago

I've done stuff where memory usage shrank from 30-60GB to under 2GB, and load times went from many minutes to essentially instant (mmap on the bits Vs needing to build objects).

That was implicit. Of course it did! :-) And, cut server costs in half, too.

auggierose 3219 days ago

I like bits, too, but cutting it in half means it is still exponential ;-)

To be fair, the savings are way above 50%, honestly. But, I didn't want to claim a bigger number without digging in more, and 50% was a very conservative upper bound. But, yeah, you're right.

keithnz 3219 days ago

there's still the whole other world of coders doing "embedded" software. They are still happily bit fiddling in blissful ignorance of web development fads. I only do a little bit these days, but one of the micros I program only has 20bytes of RAM to work with and requires bit magic. :)

digi_owl 3219 days ago

Sadly many these days conflate "embedded" with RPi, Beaglebone or some other SoC on a board...

eru 3219 days ago

Even the chips you find on your credit card have quite a lot of RAM these days, don't they?

spatulon 3219 days ago

In the automotive world (with the exception of infotainment and assisted/autonomous driving systems), ECUs have amounts of RAM measured in kilobytes, not megabytes. I've left the industry now, but I /think/ the powertrain ECUs I worked with usually had around 256KB of RAM.

That is a fascinating world to me. I've about 25 years C, rudimentary intel x86 asm and good ability to understand new hardware. Seems to me that at 40+ this is an industry that I could still be of use in. Any pointers?

i_cant_speel 3219 days ago

I think that working with bits is still a standard part of a Computer Science program. I took a class last year dealing with logic gates and bits. I also have a couple of friends that went to other schools that also took the same kind of class.

While the information is nice to know and helps every once in a while (like understanding these articles better), I just don't think it's nearly as necessary to know as it used to be.

throwaway2016a 3219 days ago

I find it incredibly hard to believe that any young code with a freshman level Computer Science education would now know bits inside and out. Then again, I went to college 10 years ago and I went to one of the better schools so maybe they don't teach that anymore?

I also acknowledge that with the invention of "code bootcamps" a lot more coders have never been to college than before. Effectively, I think, creating two types of programer that has nothing to do with age. Both types have their applications they are good at.

seanwilson 3219 days ago

> I have the feeling that knowledge about bits is lacking by a lot of younger coders. And I also think this is what causes bloatware.

I don't think this is true. I used to code in assembly and knowledge of bits is largely irrelevant the vast majority of the time now outside of graphics algorithms and low-level number crunching. Algorithmic growth and memory usage are always going to be important though.

wmu 3219 days ago

I'm a low-level programmer and it's the first time I come across the linear-feedback shift register. I would say it's rather a part of digital circuits design.

Now, when FPGAs become more and more popular, people will be forced to learn VHDL languages and then different lesser-known techniques will surface. (It's a naive prediction, I know.)

thesimon 3219 days ago

>Now, when FPGAs become more and more popular, people will be forced to learn VHDL languages and then different lesser-known techniques will surface.

Isn't that already part of a normal CS degree? We needed to learn VHDL to program a Atmel ATTiny for class.

maffydub 3219 days ago

Are you sure you learnt VHDL? As I understood it, the ATTiny isn't an FPGA but an 8-bit microcontroller.

Learning some form of assembly language at university is (probably) reasonably common.

Learning VHDL (for programming FPGAs) is less common - I did for my Computer Science degree, but this was about 15 years ago, so I don't know if it's still common.

ctw 3219 days ago

Just one data point, but I'm currently at school for Computer Engineering and we learn and use VHDL and Verilog for several courses. We use it to program Altera FPGA's.

maffydub 3219 days ago

Glad to hear it's still being taught! I was using Altera FPGAs too, although I suspect they're quite a lot higher gate count now. (My second year project was an FPGA implementation of the EDSAC computer from the 1940s - https://en.wikipedia.org/wiki/Electronic_delay_storage_autom...)

mysterydip 3219 days ago

Agreed. There's plenty of applications even today where this kind of mentality is helpful (embedded or real-time systems, for example).

It's not even necessarily about squeezing cpu or memory performance, but finding a solution that is "native" to how computers work: binary bits.

oblio 3219 days ago

> I have the feeling that knowledge about bits is lacking by a lot of younger coders.

I optimize for what they pay me for now. I also optimize for what I think they will pay me in future.

The super efficient C solution might be an salary dead-end when management also accepts a huge Hadoop solution :)

smnscu 3219 days ago

Agreed. I like this list of somewhat common bit hacks (the popular Bit Twiddling Guide): graphics.stanford.edu/~seander/bithacks.html

maccard 3219 days ago

I dont particularly care for your generalisation about younger coders, as a younger coder. There are many of us who do care about the low level details of our code, and take extra care to write good performant code. To make a generalisation about "older" programmers - I have the feelings that older coders are stuck in their ways and aren't willing to change their behaviours, and when a younger coder tries to suggest improvements, they're hand waved away because we have less experience.

pasta 3219 days ago

Well I was not trying to blame. And I 100% agree there are loads of good coders of all ages.

But knowledge about inner workings helps to understand what is going on and what can be improved.

So I don't say you should use a non GC language for example. But knowledge about memory management can help to understand memory leaks.

https://en.wikipedia.org/wiki/Linear-feedback_shift_register...

ishi 3220 days ago

Cool, I knew that LFSRs were used in ciphers. I was not aware that they were also useful for implementing old-school graphical effects.

[1]: https://en.wikipedia.org/wiki/Content_Scramble_System

ktta 3219 days ago

CSS[1] was the defacto 'DRM' back in the day, which used two LFSRs for 40-bit encryption. Apart from being able to brute force it today, there are many other ways to break it[2].

Now the DRM on DVDs/BluRays is AACS[3], which uses AES. You might also recognise it from the 'copyrighted numbers' fiasco[4]

[2]: https://en.wikipedia.org/wiki/Content_Scramble_System#Crypta...

[3]: https://en.wikipedia.org/wiki/Advanced_Access_Content_System

[4]: https://en.wikipedia.org/wiki/AACS_encryption_key_controvers...

jimktrains2 3219 days ago

Similar concepts are used for raid 6 parity (LSFR + Galois Fields). A pdf I found describing raid 6 parity describes the field and LSFR use and some background. https://www.kernel.org/pub/linux/kernel/people/hpa/raid6.pdf

(I took at stab at implementing them awhile back as an experiment. https://github.com/jimktrains/r6parity)

zimpenfish 3219 days ago

I'm not sure I'd call Wolfenstein 3D "old-school".

But then I did start with computers in 1980.

smcl 3219 days ago

It's funny where we draw our lines. Some younger guys will call Half-Life 2 old-skool, which for me felt like it was just a couple of years ago.

kbart 3219 days ago

Most of recent CS graduates weren't even born when this game was released, so it's not even "old-school" -- it's "ancient" for them.

1: https://github.com/EmmanuelOga/lfsr 2: https://en.wikipedia.org/wiki/Linear-feedback_shift_register

emmanueloga_ 3218 days ago

As interesting as this article is, I would also love to know where and how the author of the code learned about LFSRs!

I wonder if they rediscovered the algorithm, knew they were implementing a LFSR or even just solved a particular instance of the problem without ever realizing they were writing a LFSR.

I learned about LFSRs a while ago and wrote a small implementation for ruby as an exercise [1] using a Wikipedia page as reference [2]. But Wolfenstein 3D was released in 1992, I'm sure back then information was a lot harder to find online!

thomasahle 3219 days ago

Fundamentally you want to make a random permutation, but not spend linear memory on it, as you would if you did it by shuffling.

nemo1618 3219 days ago

I did something like this in Go: https://godoc.org/github.com/lukechampine/randmap/perm

jstimpfle 3219 days ago

What properties are required in an LFSR that it covers the whole range (2^n-1 numbers) before returning? Or are such configurations found experimentally?

pfedak 3219 days ago

Galois generators of the sort in the article can be described naturally using the finite field Z_2[x]/p(x), that is, polynomials with coefficients taken mod two, where we consider polynomial p(x) is equivalent to zero. If p has degree n, this field has 2^n elements - the 2^n polynomials with degree less than n are distinct, but x^n is equivalent to x^n-p(x), which has degree less than n, so everything with higher degree is already accounted for. This is, however, only actually a field if p is irreducible.

Now we can describe the generator as multiplication by x, where the leftmost bit is the lowest power, since every bit moves right except for the rightmost, which corresponds to turning x^n into x^n-p(x) as above. The cycle length is the smallest k such that x^k is equivalent to 1. Now, an interesting property of finite fields is that there is always a "primitive" element, the powers of which generate every other element (you can prove this inductively by counting elements z such that z^d = 1 for different d, and noting that z^d-1, as a degree d polynomial, has at most d roots in the field). If x is primitive, it will cycle through all other values before reaching 1. In the specific case of n=17, however every element of the field is primitive (this is easy group theory, the nonzero elements form a group under multiplication, and that group has a prime number of elements).

This means any degree-17 polynomial which is irreducible in Z_2[x] will give rise to a full-cycle-length generator. Luckily, finding an irreducible polynomial isn't too hard - of the 2^16 options (ignoring the ones without a constant term, which are obviously divisible by x), 7710 are irreducible by my quick Mathematica computation.

jimktrains2 3219 days ago

A pdf I found describing raid 6 parity describes it. https://www.kernel.org/pub/linux/kernel/people/hpa/raid6.pdf That PDF + the sibling comment should help in understanding.

(I took a stab at implementing it in C a few years back. https://github.com/jimktrains/r6parity)

vigna 3219 days ago

There's a complete theory based on primitive polynomials on Z/2Z. Check any article on linear PRNGs for references.

"Experimentally" will work only with ridiculously small state spaces.

eru 3219 days ago

I think some number theory should give you the answer. Look for something about cyclic groups and their generators..

Jyaif 3219 days ago

Note that you can (quite obviously) use this to fade from one image to an other by writing the color of the next image instead of just "red".

transitive_bs 3219 days ago

For a Javascript implementation, check out https://github.com/fisch0920/dissolve-generator

This version is based off of the article A Digital Dissolve Effect by Mike Morton "Graphics Gems", Academic Press, 1990 (http://dl.acm.org/citation.cfm?id=90821)

D_Guidi 3219 days ago

as a "senior" business programmer with non-engineering studies (I have a deegree in byology), I'm feeling an impostor reading this and admitting that I'm unable to understand basically everything... even https://bigmachine.io/products/the-imposters-handbook/ not helped too much

cousin_it 3219 days ago

It's very simple. Basically you need to cycle through all n-bit integers (except zero) in a random-looking way, and it turns out there's an arcane-ish bit twiddling operation that will do it when applied repeatedly. The reason it works can be explained with some undergrad math. You don't need to know these things for business programming but they are fun to read about.

Another idea in a similar vein is https://en.wikipedia.org/wiki/Floyd%E2%80%93Steinberg_dither... It converts an image with many shades of gray to an image with only black and white pixels. The algorithm is dead simple, but the results are surprisingly convincing.

kbart 3219 days ago

Don't worry, it's rare to see manual bit level optimization these days as compilers are quite smart in optimization. It's kind of lost art. I work with embedded systems and even there bit manipulation is mostly used for controlling MCU registers, not writing optimized code. If you still want to understand such manipulations, it's mostly boolean algebra of which there's plenty of literature to choose from.

sp332 3219 days ago

The article skipped most of the description of what LSFR's actually do, and the linked Wikipedia article is surprisingly unhelpful. After you read the value of the registers, they all get shifted one bit to the right. The last value simply falls off and is discarded. To generate the value for the new leftmost bit, which is now "empty", you XOR some of the other bits together. Then you read the new value and start over. Eventually the values will repeat, and the goal is to find a configuration that will give you the longest cycle before repeating.

dopeboy 3220 days ago

Fascinating. Who wrote the original ASM? Carmack?

userbinator 3219 days ago

My assumption is that LFSR literature was hard to come across in 1991/1992 and finding the correct tap for a 16 bit maximum length register was not worth the effort.

I guess it might be more due to the lack of overlap between the problem domains --- LFSRs were known since the late 60s in relation to CRCs and other error-correcting codes. https://en.wikipedia.org/wiki/Gold_code

abecedarius 3219 days ago

Still a good trick today. https://github.com/silentbicycle/greatest just added a feature of shuffling test executions into random order, after I suggested the general idea in a Twitter convo. (I think it uses LCGs instead of an LFSR.)

Kiro 3219 days ago

Stupid question: Why can't you just take an array of all the pixels, randomize it and then iterate it to draw?

gall_anonim 3219 days ago

Because with ideal packing it would take 150 kB (320·240·log2(320·240)/8). Wolf3d needed 640 kB of RAM and using quarter of it just to display death animation wouldn't be a good idea.

bombela 3219 days ago

because this would take a huge amount of memory. something like xy(sizeof x + sizeof y). not only memory wasn't available cheaply at the time. it would be slow to generate, write to memory, then fetch back to draw.

bkanber 3219 days ago

This was written in assembly and to do that you'd have to make a new register of all the references while randomizing. Your approach works great in modern, high level languages where there's no issue copying an array in memory and calling array_shuffle or something, and doubling the memory requirement for the effect is no problem. This approach however seems to only require 17 bits of additional memory.

et1337 3219 days ago

1) That's a lot of memory

2) Accessing memory is the slowest thing you can possibly do on a computer, other than IO

drudru11 3219 days ago

The Atari 2600 video hardware used these all over the place instead of traditional counters. It saved them a lot of gates that counters would require.

trollopTheJope 3219 days ago

i am interested to know the particulars of any routines people have for reading and reviewing a codebase, as the author talks about doing in his spare time. do you take notes? add comments? step through with a debugger?

kbart 3219 days ago

Given you are a seasoned programmer, most of the code written in familiar language should be obvious just by skimming it. But when it comes to a short and "smart" algorithms, especially including bit manipulations, I still find pen&paper the best tool to find out what's really happening.

thejynxed 3219 days ago

Pencil/pen, paper, flow charts, and in some cases one of those TI scientific calculators can come in real handy.

This being said, I've never gone much away from C/C++/Pascal/Assembly/COBOL/FORTRAN and various forms of BASIC where what I mentioned helps immensely (I work on legacy systems in my spare time, CS isn't my primary field), although Python, Haskell, Rust and GO have piqued my curiosity.