Hacker News
by Illniyar 3223 days ago
Why? I mean if you look at the majority of work that programmers do today - frontend/backend web development and apps, there is no need to have knowledge about bits.

In fact, if I see someone using binary operators in languages such as Java,JS,Ruby etc... I'll immediately consider it bad code, regardless of context - it's just not the right tool for the level of abstraction in these languages.

The fact is that in these products (frontend/backend web development and apps) the performance profile is dominated by bad algorithms, wrong data structures, slow libraries, missing db indexes etc... which knowledge about bits helps absolutely zero with.

Large binaries are also dominated not by the code you write but by overly broad libraries or simply by huge assets.

The only way I'd consider knowledge of bits to be of any help to a run-of-the-mill developer these days is in knowing that floating point numbers don't multiply/divide well - but that kind of knowledge can be imparted without really diving into how bits work.

7 comments

> In fact, if I see someone using binary operators in languages such as Java,JS,Ruby etc... I'll immediately consider it bad code, regardless of context - it's just not the right tool for the level of abstraction in these languages.

Clojure is written in Java (and Clojure). It uses bit operations to implement Software Transactional Memory and persistent collections. Is it bad code?

You seem to think a programmer should always code on the same level of abstraction. I'd argue that in many applications you often find no suitable abstraction, and have to implement it yourself.

Also, linear feedback shift registers can be used without any knowledge about bits other than the fact that integers wrap around when they reach their maximum value (and every programmer should know that anyway). And these registers are useful for some very nice algorithms (like in-place pseudorandom permutation generators - for example for shuffling songs in a media player, or for some AI algorithms).
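
A minimal sketch of the shuffling idea, in Java since that is the language under discussion. It assumes a 16-bit Galois LFSR with the commonly cited toggle mask 0xB400 (taps 16, 14, 13, 11), which cycles through all 65535 non-zero states before repeating. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Galois LFSR sketch; LfsrShuffle, next and order are illustrative names. */
final class LfsrShuffle {
    // Assumption: toggle mask of a maximal-length 16-bit Galois LFSR (taps 16, 14, 13, 11).
    static final int MASK_16 = 0xB400;

    /** Advances the register one step: shift right, XOR the mask in if a 1 bit fell off. */
    static int next(int state, int mask) {
        int lsb = state & 1;
        state >>>= 1;
        return lsb != 0 ? state ^ mask : state;
    }

    /** Returns 0..n-1 in a fixed pseudorandom order, for 1 <= n <= 65535 and seed in 1..65535. */
    static List<Integer> order(int n, int seed) {
        if (n < 1 || n > 0xFFFF || seed < 1 || seed > 0xFFFF) {
            throw new IllegalArgumentException("n and seed must be in 1..65535");
        }
        List<Integer> result = new ArrayList<>(n);
        int state = seed;
        do {
            if (state <= n) {          // skip register values that fall outside the playlist
                result.add(state - 1);
            }
            state = next(state, MASK_16);
        } while (state != seed);       // back at the seed: every non-zero state was visited once
        return result;
    }
}
```

`LfsrShuffle.order(10, 1)` yields each index 0..9 exactly once, in an order that depends on the seed. A player would normally keep only the current state and call `next` per track rather than build the whole list.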

> It uses bit operations to implement Software Transactional Memory and persistent collections. Is it bad code?

I don't know anything about Clojure, but it sounds like you're clobbering in the suggestion that because bitwise operations are bad in high level business code, they must also be bad in low level language implementation code. At best you're not paying attention, at worst that's an intellectually dishonest counter argument.

I don't know enough about Clojure, but I think you are saying that Clojure itself is using bit operations under the hood - of course it's ok for a language implementation to use bit operations.

But if any of your Java code for applications or libraries uses bit operations, I'll call it bad code.

But Clojure is a Java application, and its libraries (written in Java) are also Java libraries; you can use them from Java (and some people do).

There's nothing wrong with writing your own library implemented using bit operations.

Conway's Game of Life can be implemented naively in C/C++/Java/etc, using a boolean for every cell's state (on/off). This will require at least n*m bytes (probably more in Java). Using bits to store that data will require 8 times less, which will most likely greatly increase the performance because of the data locality and the amount of data that will fit into a CPU cache.
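
A rough sketch of the bit-packed layout described above, assuming a fixed-size grid whose cells beyond the border count as dead. `BitGrid` and its methods are hypothetical names, and the one-byte-per-boolean figure it is compared against is typical for JVMs rather than guaranteed by the language.

```java
/** Bit-packed on/off grid: 64 cells per long instead of one boolean per cell. */
final class BitGrid {
    final int width, height;
    private final long[] words;

    BitGrid(int width, int height) {
        this.width = width;
        this.height = height;
        this.words = new long[(width * height + 63) >>> 6];   // round up to whole longs
    }

    boolean get(int x, int y) {
        int i = y * width + x;
        return (words[i >>> 6] & (1L << (i & 63))) != 0;
    }

    void set(int x, int y, boolean alive) {
        int i = y * width + x;
        if (alive) {
            words[i >>> 6] |= 1L << (i & 63);
        } else {
            words[i >>> 6] &= ~(1L << (i & 63));
        }
    }

    /** Counts live cells among the eight neighbours; out-of-range cells count as dead. */
    int liveNeighbours(int x, int y) {
        int count = 0;
        for (int dy = -1; dy <= 1; dy++) {
            for (int dx = -1; dx <= 1; dx++) {
                int nx = x + dx, ny = y + dy;
                boolean self = dx == 0 && dy == 0;
                if (!self && nx >= 0 && nx < width && ny >= 0 && ny < height && get(nx, ny)) {
                    count++;
                }
            }
        }
        return count;
    }

    /** Applies the standard rules once and returns the next generation. */
    BitGrid step() {
        BitGrid next = new BitGrid(width, height);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int n = liveNeighbours(x, y);
                next.set(x, y, n == 3 || (n == 2 && get(x, y)));
            }
        }
        return next;
    }

    int storageBytes() {
        return words.length * Long.BYTES;
    }
}
```

A 64x64 grid takes 512 bytes of cell storage this way, against 4096 for one byte per cell. This naive `step` still visits every cell one at a time, so it only demonstrates the storage layout, not the faster strategies.
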
You could probably get better savings by taking advantage of the fact that 99%+ of Game of Life state is static between iterations.

Maybe quadtree division of space etc. Only touch nodes that had no short period objects, like honeycombs, boxes or blinkers.

You wouldn't even need storage for completely empty nodes.

I'm sure the best/fastest GoL engines have even more interesting strategies.

Conway's game of life is at best school homework, it does not correlate to any programming done in a real job.
I'm frankly a bit offended by your dismissal of this type of computer science and computer programming as not a "real job".

Sure, Conway's Game of Life is a toy problem, but the optimizations used to make it run fast (which go way beyond mere bit-twiddling, btw) are useful tools and practice in a programmer's mental toolbox, for solving real problems.

Certainly, a programmer who "merely" plugs frontends into backends with databases or whatever it is that you consider a "real job" in programming[0], doesn't need to know about bit-level optimizations to do that job. But I would argue that even then, knowing about that sort of thing makes you a better programmer: it broadens your bases, versatility and well-roundedness.

That doesn't mean you have to use it on problems where it doesn't belong, or for premature optimization.

Elsewhere in this thread you say you don't need low-level optimizations because Redis handles this for you. But you wouldn't say that the people researching, designing and building Redis weren't doing "real jobs", would you?

I think broadening your bases is actually a pretty good reason for learning about these (or other) things, because you seem to have a pretty narrow view of what a "real job" is in programming. In particular, you can't seem to come up with any good reasons for knowing about bits, except IEEE floats.

What about graphics programming? Pixels are usually packed into a 32-bit RGBA word, and you need some basic bit operations for this. In most situations, abstracting that away kills performance.
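
For illustration, a small sketch of that kind of pixel packing. The 0xRRGGBBAA layout is an assumption made here (many APIs use ARGB or another order instead), and `Rgba` is a made-up helper class, not part of any graphics library.

```java
/** Packing and unpacking 8-bit channels in one 32-bit int, laid out as 0xRRGGBBAA. */
final class Rgba {
    static int pack(int r, int g, int b, int a) {
        return (r & 0xFF) << 24 | (g & 0xFF) << 16 | (b & 0xFF) << 8 | (a & 0xFF);
    }

    // Unsigned shift (>>>) so that a set top bit does not smear into the result.
    static int red(int p)   { return (p >>> 24) & 0xFF; }
    static int green(int p) { return (p >>> 16) & 0xFF; }
    static int blue(int p)  { return (p >>> 8) & 0xFF; }
    static int alpha(int p) { return p & 0xFF; }

    /** Halves R, G and B in a single pass over the word and leaves alpha untouched. */
    static int darken(int p) {
        // The mask drops the bit each channel would otherwise leak into its lower neighbour.
        return ((p >>> 1) & 0x7F7F7F00) | (p & 0xFF);
    }
}
```

`darken` is the sort of operation where the packed form pays off: three channels are halved with one shift and one mask, with no per-channel objects or loops.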

If you need to read out sensors from a device, you're probably going to come across bits (and much other ancient programming lore) somewhere along the line.

If you write code for small / low powered hardware (like Arduinos or equivalent) you are going to need to know your way around bits and bit-manipulation code is very welcome there.

Knowing about these techniques is also important because sometimes they suddenly become useful in a whole new or modern context. Take antirez's Feistel network approach to this fizzling-problem, for instance. That's a cryptographic primitive now used in a graphics context. You can make exactly the same point about it, that a programmer who is supposed to do frontend/backend/db work, should never write code like that, and indeed this is true, you most definitely should NOT be coding your own versions of crypto primitives in a context like that (I'd argue that's even worse than bit-twiddling).

[0] I'm kidding, I respect the work that goes into it. I think it's boring as hell, but it's a lot of work and not all trivial either.

Totally agree.

> (which go way beyond mere bit-twiddling, btw)

I was fascinated by the optimizations people come up with for Game of Life when reading Michael Abrash's Black Book.

> a programmer who "merely" plugs frontends into backends with databases

I am such a programmer: never have used bit manipulation in my job, and while it is kinda boring, it is a job, and someone has to do it.

But if a standard library implementation is allowed to do something why can't I?
This just seems wrong to me. There are plenty of areas of programming where knowing a little bit about internal data representation, either in memory or in your data store, can help you make better choices. Not everyone needs to be a system programmer, but folks still need general computer awareness. We just aren't far enough from the CPU yet. Sufficiently advanced understanding of algorithms implies computer architecture awareness to me.

I get what you are writing about here, that higher level concepts dominate programming many common apps today, but it seems like a weird place to allow yourself to have a knowledge gap.

It shouldn't feel wrong to you. In today's connected big-data world, the vast majority of your performance latency is from querying your data (if you structure your db incorrectly), followed closely by client to server communication.

That's the p90 use case for development people are doing.

So many simple things are only easy if you know bits. Interpreting a packet capture, poking at memory, designing cache friendly data structures, ... It is like second nature to me. If it isn't required (obviously it isn't), it must at least be a strong competitive advantage. I am not that old. I don't have "get off my lawn" moments. I grew up (really learned at least) on a processor (P133) where bits started mattering less, but they still mattered.
Never needed to interpret a packet, I intercept communications at the 7th layer to troubleshoot things - namely http connections.

I don't poke at memory, I use a profiler that tells me how much memory every piece of code and every variable I use takes.

I don't design cache friendly data structures, I use redis.

I would wager that my experience is closer to the development situation and needs of the majority of programmers.

I don't see why knowledge of bit operations is necessary to do a good programming job.

So you're looking at an HTTP request/response pair, from a service that you did not design, and somewhere in one of the headers you spot a string that looks like this:

YmFzZTY0IGFsd2F5cyB0ZW5kcyB0byBsb29rIGtpbmQgb2YgbGlrZSB0aGlz

Some people will immediately know what to do with that string in the blink of an eye, and the reason why they can spot that with a single glance has very much to do with basic knowledge about bit manipulations.
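
To make the bit connection concrete: in standard base64, each character carries 6 bits, so four characters pack exactly three bytes. The sketch below decodes one such group by hand with shifts and then hands a whole string to `java.util.Base64`. `Base64Peek` and `decodeQuad` are illustrative names, and `decodeQuad` assumes a valid, unpadded group from the standard alphabet.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64Peek {
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    /** Decodes one unpadded group of four base64 characters into three bytes. */
    static byte[] decodeQuad(String quad) {
        int bits = 0;
        for (int i = 0; i < 4; i++) {
            int sixBits = ALPHABET.indexOf(quad.charAt(i));   // position in the alphabet = 6-bit value
            bits = (bits << 6) | sixBits;                     // accumulate 4 * 6 = 24 bits
        }
        // Slice the 24 bits back into three 8-bit bytes.
        return new byte[] { (byte) (bits >>> 16), (byte) (bits >>> 8), (byte) bits };
    }

    /** Same job for a whole string, using the standard library decoder. */
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }
}
```

Passing the header value above to `Base64Peek.decode` prints readable text, and feeding its first four characters to `decodeQuad` gives the first three bytes of that same text.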

> I don't see why knowledge of bit operations is necessary to do a good programming job.

The programmer that can't do the above will be stuck. Probably they can still do a good programming job regardless, but the programmer that can is a better programmer.

This is knowledge that is related to your practical work and part of your field, but not technically part of the job itself. That is very useful because it makes you more well-rounded and better at your job, meaning you have more mental tools to tackle unforeseen problems.

If you only know the things you strictly need to know for your job, you're going to get stuck as soon as you encounter a problem that requires novel thinking. And I dunno, I also consider that ability to be part of a "real job".

These are good points. At some point architecture and systems like Redis will become the only tools that can reliably implement these behaviors if no one ever learns about bits. Similar to how hard it can be to outperform compilers in many areas when writing assembly by hand.

Then again there are programming disciplines where you have to know all of this and more. It ultimately comes back to those capable of implementing the tools/frameworks/compilers and those that just use them. Safe crypto libraries are impossible without bit wrangling.

I will still argue that having these skills has at times saved me orders of magnitude of time. Maybe it wasn't worth all the effort I spent if it only saves time once in a while. Then again I haven't been a full time developer in like 11 years, now I break other people's software for a living and bang bits together on a weekly basis :)

These design|use patterns depend entirely on a level of abstraction facilitated by code you don't understand and I have seen this bite people badly.

Using redis or memcache or whatever semi-persistent, network-available data store for key value pairs only has middling value (to me) and relies on adopting a data model that is severely limited in scope. If you can't understand the structure of a 'packet' on the network then you can't really troubleshoot network dependent services that provide your L7 function.
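
As a small illustration of what "the structure of a packet" means at the bit level, here is a sketch that pulls a few fields out of an IPv4 header. It assumes the array starts at the first byte of the IP header (no Ethernet framing in front) and holds at least ten bytes. `Ipv4Peek` is a hypothetical helper, not an existing library class.

```java
/** Reads a few fixed-position fields from the start of an IPv4 header. */
final class Ipv4Peek {
    /** IP version: the high nibble of the first header byte. */
    static int version(byte[] packet) {
        return (packet[0] & 0xFF) >>> 4;
    }

    /** Header length in bytes: the low nibble counts 32-bit words. */
    static int headerLength(byte[] packet) {
        return (packet[0] & 0x0F) * 4;
    }

    /** Total packet length: a 16-bit big-endian field at offsets 2 and 3. */
    static int totalLength(byte[] packet) {
        return (packet[2] & 0xFF) << 8 | (packet[3] & 0xFF);
    }

    /** "Don't Fragment" flag: the second-highest bit of the byte at offset 6. */
    static boolean dontFragment(byte[] packet) {
        return (packet[6] & 0x40) != 0;
    }

    /** Protocol number at offset 9 (6 is TCP, 17 is UDP). */
    static int protocol(byte[] packet) {
        return packet[9] & 0xFF;
    }
}
```

The `& 0xFF` masks are there because Java bytes are signed; without them a byte such as 0xC0 would read as a negative number.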

What are some good resources to learn about bits?
There are some quite good pages, I don't remember the exact titles, but the first results are quite ok: https://www.google.de/search?q=bit%20hacks
You can write mostly C code. Each processor will have different quirks.

https://en.wikipedia.org/wiki/CPU_cache

https://www.akkadia.org/drepper/cpumemory.pdf

These are great places to start.

Especially Ulrich Drepper's PDF on memory.

edit: even simple things like looping can require knowledge of "bits" and "memory layout" and "CPU cache" to write the fastest code. Most people don't need to write the fastest code though.
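
A sketch of the looping point, written in Java to match the rest of the thread's examples. Both methods compute the same sum over a rectangular `int[][]`, but the first reads each row array front to back while the second jumps to a different row array on every read. On large grids the first order is usually the faster one; nothing is measured here, and actual timings depend on the JVM and the hardware. `Traversal` is an illustrative name.

```java
final class Traversal {
    /** Walks each row left to right, so consecutive reads are adjacent in memory. */
    static long sumRowMajor(int[][] grid) {
        long sum = 0;
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                sum += grid[row][col];
            }
        }
        return sum;
    }

    /** Walks column by column, touching a different row array on every read. Assumes a rectangular grid. */
    static long sumColumnMajor(int[][] grid) {
        long sum = 0;
        int cols = grid.length == 0 ? 0 : grid[0].length;
        for (int col = 0; col < cols; col++) {
            for (int row = 0; row < grid.length; row++) {
                sum += grid[row][col];
            }
        }
        return sum;
    }
}
```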

What's a "p90 use case"?
90th percentile use case
Ha I thought it was related to P90X somehow

https://en.wikipedia.org/wiki/P90X

>the majority of work that programmers do today - frontend/backend web development and apps

Therein lies your bias, and the rest of your comment just follows.

I'm definitely biased towards frontend/backend web development and apps (i.e. mobile/desktop apps).

But that's not to say the premise is bad - I don't think it's wrong to assume the majority of work is in that area, definitely for young developers. There aren't that many real-time and performance-demanding programming jobs compared to web development.

I don't think even game programming uses bit operations heavily these days - most of the low-level work is done by third party engines.

I remember a recent example. What I wrote was:

    int x = somefunction();
    int x_dividedby16 = x >> 4;

My coworker corrected the second line to something like:

    int x_dividedby16 = (int)Math.ceil(x / 16.0);

I'd only ever use a bitshift if the intention of the code was to move the bits left or right (e.g. moving red, green or blue pixel component bits into the right position) or if it was absolutely required for speed, as a shift is less readable when your intention is to divide a number. C compilers, for example, will easily optimise simple integer divisions like this.
Did your coworker correct it with "this is more readable" or "this is the correct way to do it"? The first is arguable (though I don't personally agree with the argument), the second just betrays a real lack of knowledge.
Don't the two bits of code have completely different behaviour? And the result of the bitshift on negative integers is implementation-defined (in C). So the bottom code could indeed be the "correct way to do it".
Fair, with negative ints they're different. They have the same behavior on positive ints, though.
Why not `x / 16`?
Integer division will give a rounded result instead of floor
I think they're going for the ceiling here, not the floor. Truncation of positive values should be the same as a floor function.
Yes, I wrote the example. Then realized it was different and corrected only half of it.
Not in C..?

    (x + 15) / 16

Yes, which is more readable and would be compiled to the same thing
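
To pin down where the variants in this subthread agree and where they don't, here is a side-by-side sketch using Java semantics (C differs for negative shifts, as noted above). `DivideBy16` and its method names are made up for the comparison.

```java
final class DivideBy16 {
    /** Arithmetic shift: rounds toward negative infinity. */
    static int shift(int x) {
        return x >> 4;
    }

    /** Java integer division: rounds toward zero. */
    static int truncate(int x) {
        return x / 16;
    }

    /** The coworker's version: rounds toward positive infinity. */
    static int ceilDouble(int x) {
        return (int) Math.ceil(x / 16.0);
    }

    /** Integer-only ceiling; matches ceilDouble only for x >= 0 and x + 15 not overflowing. */
    static int ceilInt(int x) {
        return (x + 15) / 16;
    }
}
```

For x = 17 the four methods return 1, 1, 2 and 2; for x = -17 they return -2, -1, -1 and 0. So the shift and the plain division agree on non-negative input, the two ceiling versions agree on non-negative input, and the original "correction" changed the result rather than just the spelling.
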
Just the other day I showed one of our junior devs how they could turn their 10 lines of somewhat slow and obfuscated code into 3 arguably more readable lines using bitwise operations. And this was a front-end webapp.

Then again, more readable is subjective and some devs might see "<<" and get thrown off.

A reasonable 'compromise' of 3 lines of code with 7 lines of comment to explain what it's doing would be my preference.
Or... and just throwing this out there... developers can take responsibility to know what bitwise operators do the same as they know what "+" does.

The three lines of code were:

for ( ... ) { x = x << y }

The developer was essentially using an array of booleans to simulate a bitwise operator.

It is very sad that devs these days need a comment to know what that does. Would we expect 7 lines of comments if it was:

for ( ... ) { x = x + y; }

> developers can take responsibility to know what bitwise operators do the same as they know what "+" does

Yes but they won't. Pragmatically speaking adding a comment to explain a concept the next developer won't use very often means you won't need to answer their questions about it when they next need to modify that piece of code. That's a big win for you, and it means the other developer can carry on being productive.

> using binary operators [...] bad code, regardless of context

> the performance profile is dominated by bad algorithms, wrong data structures

Do you know how you often implement good algorithms and data structures? You wouldn't implement a bloom filter as a high level array of 1/0 integers, would you?

I would definitely implement a bloom filter with an array of booleans, and will not use bit operations to modify the cells.

Whether I use array access or a bitmask will mean nothing performance-wise, since the bloom filter itself likely already speeds up a different algorithm by an order of magnitude.
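
For comparison, this is roughly what the bit-packed variant being argued about looks like. It is a sketch only: `TinyBloom` is a made-up class, the double-hashing scheme and its mixing constant are arbitrary choices made here, and whether the smaller footprint matters depends on how large the filter is.

```java
/** Bloom filter over Strings that stores its bits in a long[] (64 bits per element). */
final class TinyBloom {
    private final long[] bits;
    private final int size;      // number of bits in the filter
    private final int hashes;    // number of probe positions per key

    TinyBloom(int sizeInBits, int hashes) {
        this.size = sizeInBits;
        this.hashes = hashes;
        this.bits = new long[(sizeInBits + 63) >>> 6];
    }

    /** i-th probe position via double hashing; the constant is an arbitrary odd multiplier. */
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 15) * 0x9E3779B1 | 1;
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String key) {
        for (int i = 0; i < hashes; i++) {
            int p = position(key, i);
            bits[p >>> 6] |= 1L << (p & 63);      // set bit p
        }
    }

    /** False means definitely absent; true means present or a false positive. */
    boolean mightContain(String key) {
        for (int i = 0; i < hashes; i++) {
            int p = position(key, i);
            if ((bits[p >>> 6] & (1L << (p & 63))) == 0) {
                return false;
            }
        }
        return true;
    }
}
```

A `boolean[]` version would replace the two shift-and-mask lines with plain array reads and writes, at roughly eight times the storage on typical JVMs.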

For most operational purposes there isn't a crying need for optimizations and handwritten assembly outside the kernel, but a good programmer must at least recognize where something like this is called for and be able to implement it (imo).