Hacker News
by josephg 814 days ago
Yeah... it's shocking to me how difficult it is to read the C++ standard library. Surely, the standard library is written by the authors of the language. It should be a positive example of how they hope their language is used, right?

Here's the source of C++'s vector class:

https://gcc.gnu.org/onlinedocs/gcc-4.6.2/libstdc++/api/a0111...

In comparison, Vec in Rust. (Note that you need to scroll down a few pages to start seeing non-trivial functions. There are a lot of block comments.):

https://doc.rust-lang.org/src/alloc/vec/mod.rs.html#398

Or list in Go:

https://cs.opensource.google/go/go/+/master:src/container/li...

To my eye, that C++ code is by far the hardest code to read.

5 comments

Notice that there are in practice three distinct implementations of the C++ standard library. They're all awful to read, though. Here's Microsoft's std::vector: https://github.com/microsoft/STL/blob/main/stl/inc/vector

However, you're being slightly unfair, because Rust's Vec is just defined (opaquely) as a RawVec plus a length value. So let's link RawVec too: https://doc.rust-lang.org/src/alloc/raw_vec.rs.html . RawVec is the part responsible for the messy problem of how to actually implement the growable array type.
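To make the "raw buffer plus a length" split concrete, here's a rough C++ analogue. This is only a sketch and not Rust's actual code: RawBuf and Vec are made-up names, T is assumed not to be over-aligned, and exception safety is ignored. One difference from Rust is that C++ has to move-construct elements when it reallocates, so grow() needs to be told how many elements are live.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical RawBuf: owns uninitialised storage and tracks capacity.
// It never decides how many elements exist; that is the wrapper's job.
template <class T>
class RawBuf {
public:
    RawBuf() = default;
    RawBuf(const RawBuf&) = delete;
    RawBuf& operator=(const RawBuf&) = delete;
    ~RawBuf() { ::operator delete(ptr_); }

    T* ptr() const { return ptr_; }
    std::size_t capacity() const { return cap_; }

    // Allocate a larger buffer, move the first `live` elements into it,
    // then release the old storage.
    void grow(std::size_t live) {
        std::size_t new_cap = cap_ == 0 ? 4 : cap_ * 2;
        T* fresh = static_cast<T*>(::operator new(new_cap * sizeof(T)));
        for (std::size_t i = 0; i < live; ++i) {
            new (fresh + i) T(std::move(ptr_[i]));
            ptr_[i].~T();
        }
        ::operator delete(ptr_);
        ptr_ = fresh;
        cap_ = new_cap;
    }

private:
    T* ptr_ = nullptr;
    std::size_t cap_ = 0;
};

// Hypothetical Vec: just a RawBuf plus a length, mirroring the layout described above.
template <class T>
class Vec {
public:
    Vec() = default;
    ~Vec() {
        // Destroy live elements; buf_ frees the storage afterwards.
        for (std::size_t i = 0; i < len_; ++i) buf_.ptr()[i].~T();
    }

    void push(T value) {
        if (len_ == buf_.capacity()) buf_.grow(len_);
        new (buf_.ptr() + len_) T(std::move(value));
        ++len_;
    }

    std::size_t len() const { return len_; }

    const T& operator[](std::size_t i) const {
        assert(i < len_);  // debug-only bounds check
        return buf_.ptr()[i];
    }

private:
    RawBuf<T> buf_;
    std::size_t len_ = 0;
};
```

With this split, Vec itself stays short and readable, and all the allocation mess lives in RawBuf, which is roughly why the Rust Vec source looks cleaner than it otherwise would.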

Still, the existence of three C++ libraries with slightly different (or sometimes hugely different) quality of implementation means good C++ code can't depend on much beyond what the ISO document promises, and yet it must guard against the nonsense inflicted by all three and by the shortcomings of the larger language. In particular, everything must use the reserved prefix so that it's not smashed inadvertently by a macro, and lots of weird C++ idioms that preserve performance by sacrificing clarity of implementation are needed, even where you'd ordinarily sacrifice performance to get the development throughput win of everybody knowing what's going on. For example, you'll see a lot of "pair" types brought into existence which are there to squirrel away a ZST (zero-sized type) that in C++ can't exist, using the Empty Base Optimisation. In Rust the language has ZSTs so they can just write what they meant.
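For anyone who hasn't run into the Empty Base Optimisation trick, here's a minimal sketch of the idea. EmptyAlloc, NaiveVec, CompressedPair and CompactVec are names I made up for illustration; real standard libraries have their own (much uglier) versions of the same thing.

```cpp
#include <type_traits>

// Hypothetical stateless allocator-like type: no data members at all.
struct EmptyAlloc {};

// Naive layout: the empty object is stored as a member, so it still
// takes up at least one byte, plus padding up to pointer alignment.
struct NaiveVec {
    int* data;
    EmptyAlloc alloc;
};

// "Compressed pair" trick: inherit from the empty type instead of storing
// it, so the Empty Base Optimisation lets it occupy no space.
template <class Empty, class T>
struct CompressedPair : private Empty {
    T second;
    Empty& first() { return *this; }
};

struct CompactVec {
    CompressedPair<EmptyAlloc, int*> data_and_alloc;
};

static_assert(std::is_empty<EmptyAlloc>::value, "EmptyAlloc carries no state");
static_assert(sizeof(NaiveVec) > sizeof(int*), "empty member still costs space");
static_assert(sizeof(CompactVec) == sizeof(int*), "empty base costs nothing");
```

The inheritance here has nothing to do with modelling an "is-a" relationship; it exists purely to get the layout, which is the kind of clarity-for-performance trade described above.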

Part of the distinct "C++ std" style is due to naming rules and textual includes. Every name that's not part of the standardized interface starts with __ or with _ plus a capital letter, because __foo and _Foo are blanket reserved names, so a user can't complain that std:: explodes when they write "#define _Base 0".
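A small sketch of why namespaces alone don't help here. Everything below is hypothetical (mylib stands in for a header-only library, and mylib_detail_limit stands in for a reserved-style name such as __limit, which only the real implementation is allowed to use). The point is that a macro is textual substitution, so it rewrites names inside a namespace too:

```cpp
// "Header A", included before the user's macro exists.
namespace mylib {
constexpr int limit = 10;
constexpr int clamp_early(int x) { return x < limit ? x : limit; }
}

// User code defines an innocent-looking macro.
#define limit 3

// "Header B", included afterwards. The same source text now means
// something else: the preprocessor turns `limit` into `3`.
namespace mylib {
constexpr int clamp_late(int x) { return x < limit ? x : limit; }

// Uglified internal name that user macros are unlikely to hit.
constexpr int mylib_detail_limit = 10;
constexpr int clamp_safe(int x) {
    return x < mylib_detail_limit ? x : mylib_detail_limit;
}
}

#undef limit

static_assert(mylib::clamp_early(7) == 7, "sees the library constant");
static_assert(mylib::clamp_late(7) == 3, "silently hijacked by the macro");
static_assert(mylib::clamp_safe(7) == 7, "uglified name is unaffected");
```

In this sketch the hijack happens to compile; more often the macro just produces a baffling error deep inside a header. The real libraries go one step further than an unlikely name: they use names the user is forbidden to define at all.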
> It should be a positive example of how they hope their language is used, right?

Should it, though? There are a million ways to learn C++. Reading the std code definitely isn't one of them; technically the std could be entirely compiler builtins. If you want to read positive examples, take A Tour of C++, 3rd edition (https://www.amazon.ca/Tour-C-Bjarne-Stroustrup/dp/0136816487)

"Do as I say, not as I do" is known to be poor pedagogy.

If you find that expert practitioners don't do the things you think students should be doing, it suggests that something is wrong and needs fixing. In the standard library implementations it's very obvious that something is badly wrong, and yet for decades C++ has resisted the hard work of fixing it.

Yet thousands of people learned C++ well enough to use it in a professional setting, so surely << "Do as I say, not as I do" is known to be poor pedagogy. >> does not hold in the general case.

> If you find that expert practitioners don't do the things you think students should be doing, it suggests that something is wrong and needs fixing

It does not, and that is a very naïve world view. In any trade, expert practitioners' way of working is wildly different from what you would learn in a classroom (and generally makes said student's hair stand on end when they see it). This is not too relevant with C++ though as the std implementation "ugliness" is mainly driven by material constraints.

Besides, in programming in general this is not even possible. Like, from your argument one wouldn't be able to learn how to use the Win32 API or Cocoa API since the operating systems using them are closed-source and you cannot see how they are implemented & used by the teams who develop these APIs.

No, I reject the claim that "my argument" (not in fact mine) says you can't learn an API unless you can see how it's implemented. I don't think that's a remotely plausible reading of what was written. Instead, I agree with the claim they actually wrote: since software is intended first and foremost to be read, it makes sense that the standard library, software you'll be using as a programmer, should be a positive example and not a horrible distorted mess.

> This is not too relevant with C++ though as the std implementation "ugliness" is mainly driven by material constraints.

These "material constraints" are clearly completely artificial - since they don't show up in other languages (rust, go, swift, haskell, etc etc). For example, in C++'s std header files:

- Symbols are hand-mangled. (Why? Wasn't that the whole point of C++ namespaces?)

- There are no comments. I'm guessing the reason for this is that C++'s idiotic build process dedicates an insane amount of CPU time to redundantly reparsing the std header files over and over, forever. This has a very real performance impact across the ecosystem. Either that or the authors just don't believe in commenting their code.

- Templating seems to make the code even more unreadable. Which is strange, because rust's standard library also uses generics and yet it is totally readable.

- Different C++ compilers have different implementations of the standard library, with different performance profiles and quality standards. There is no good reason for this.

> Like, from your argument one wouldn't be able to learn how to use the Win32 API or Cocoa API since the operating systems using them are closed-source and you cannot see how they are implemented & used by the teams who develop these APIs.

This wasn't the central argument, but you're still kinda right about this!

I don't have much experience with Windows, but I can tell you from my personal experience that it is significantly harder to understand Apple's platform APIs because the code is closed source. There are lots of important methods in Apple's APIs with obscure, technical names and next to no documentation. You have no idea what they do, or if they'll solve the problem you're facing. It's crazy frustrating. When working with open source code (eg rust, javascript, java, etc) I'm constantly reading the source code of library functions I call to understand how they work and what they do. It's like a backstop for documentation. If the docs are missing or not good enough, having the code available means I can still almost always figure out how to solve my problem.

I can't find it now, but there was a comment thread about Windows engineers a few decades ago admitting they made some APIs obscure and badly documented on purpose so they could make money writing and selling technical books on the side on "Windows Internals". Because the source code wasn't available and documentation was shoddy, you needed to buy those books in order to understand and use some of the Windows APIs correctly.

So yes, library code should be readable. When you can't read the libraries you're using, it causes all sorts of problems.

I work on clang and don't know Go, yet I still find the Go version easier to read.

I think the C++ version could be more understandable, but it’s as if the authors intentionally made it as obtuse as possible.

The authors are required to make it obtuse. They're required to use warts on all of the names because most of the code is in the header files and is generative code compiled by users of the library rather than the vendor. In order to avoid naming conflicts they can only use obscured names in their implementation for anything but the defined API (e.g. naming any internal functions, macros, or variables with leading underscores).

So, the authors did intentionally make it as obtuse as possible for your benefit. It's written to be used, not studied, by all kinds of developers in all kinds of circumstances.

They could supply a "pretty" version for people who want to review it. Every time I have to step through code (and accidentally step into STL code) it looks sloppy and gross, like a swamp. No comments or organization. I would expect something neatly formatted, and comments saying "This is overload-4 of std::copy()..." etc.

Professional software developers have a lot to do just to get their job done on time and within budget. Having to duplicate all their code just so that people who contribute nothing to the end product can have an easy time understanding it is just never going to be a priority worth addressing.

The problem here is not really the code, it's the reader.

Who says they have to duplicate it? Just write the original version cleanly, clearly, and concisely. Then run it through a mangler to rename variables to avoid collisions.

The STL is maintained by volunteers, it's a FOSS project. So your appeal to Serious Business doesn't hold.