Hacker News
by TuxSH 335 days ago
> This design decision at the source level, means that in our linked binary we might not have the logic for the 3DES building block, but we would still have unused decryption functions for AES256.

Do people really not know about `-ffunction-sections -fdata-sections` & `-Wl,--gc-sections` (doesn't require LTO)? Why is it used so little when doing statically-linked builds?
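
For anyone who hasn't used them, here is a minimal sketch of what those flags do, assuming GCC or Clang with GNU ld (or lld) on an ELF target. The file name, `main.o`, and both functions are made up for illustration; they stand in for a building block where only one direction is ever called.

```c
/* crypto_block.c (hypothetical file and function names).
 *
 * Compile with one section per function and per data object:
 *   cc -Os -ffunction-sections -fdata-sections -c crypto_block.c
 * Link, discarding sections nothing refers to (no LTO involved):
 *   cc -Wl,--gc-sections -Wl,--print-gc-sections main.o crypto_block.o -o app
 */
#include <stdint.h>

/* With -fdata-sections this table gets its own .rodata.* section. */
const uint8_t toy_key[4] = { 0x13, 0x37, 0xC0, 0xDE };

/* Placed in .text.toy_encrypt; kept if the application calls it. */
uint8_t toy_encrypt(uint8_t byte, unsigned round)
{
    return (uint8_t)(byte ^ toy_key[round % 4u]);
}

/* Placed in .text.toy_decrypt; if nothing references it,
 * --gc-sections can drop it from the final binary. */
uint8_t toy_decrypt(uint8_t byte, unsigned round)
{
    return (uint8_t)(byte ^ toy_key[round % 4u]);
}
```

`--print-gc-sections` makes the linker list the sections it removed, which is a quick way to check that the flags are having any effect.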

> Let’s say someone in our library designed the following logging module: (...)

Relying on static initialization order, and on runtime static initialization at all, is never a good idea IMHO
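
One common alternative is to initialize lazily on first use, so nothing depends on the order in which static initializers run. This is a sketch in C of what such a logging module could look like; it is an assumption, not the article's code, and every name in it is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    FILE *sink;      /* where messages go */
    unsigned count;  /* messages written so far */
    bool ready;      /* set once the logger has been set up */
} logger;

/* The instance is zero-initialized at load time (no code runs for that);
 * the real setup happens on the first call, whoever makes it.
 * Not thread-safe as written. */
static logger *logger_get(void)
{
    static logger instance;
    if (!instance.ready) {
        instance.sink = stderr;
        instance.ready = true;
    }
    return &instance;
}

/* Returns the number of characters written, or a negative value on error. */
int log_message(const char *msg)
{
    logger *lg = logger_get();
    lg->count++;
    return fprintf(lg->sink, "%s\n", msg);
}

/* Number of messages logged so far. */
unsigned log_count(void)
{
    return logger_get()->count;
}
```

The cost is a branch on every call; in exchange, the module works no matter which translation unit touches it first.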

6 comments

There's also the 'old-school' method of compiling each function into its own object file; I guess that's why musl has each function in its own source file:

https://github.com/kraj/musl/tree/kraj/master/src/stdio

...but these days -flto is simply the better option to get rid of unused code and data - and enable more optimizations on top. LTO is also exactly why static linking is strictly better than dynamic linking, unless dynamic linking is absolutely required (for instance at the operating system boundary).

Or plugins, or hot code reload techniques, unless people want to go the OS IPC route, which many seem to have forgotten; it is safer, but naturally more resource-demanding.
Yes, these are really esoteric options, and IIRC GCC's docs say they can be counter-productive.
One engineer's esoteric is another's daily driver. All 3 of those options are borderline mandatory in embedded firmware development.
These options can easily be found by a Google Search or via LLM, whichever one prefers

> they can be counter-productive

Rarely[1]. The only side effect this can have is the constant pools (for ldr rX, [pc, #off] kind of stuff) not being merged, but the negative impact is absolutely minimal (different functions usually use different constants after all!)

([1] assuming ELF file format or ELF+objcopy output)

There are many other upsides too: you can combine these options with -Wl,--wrap=<symbol> to e.g. prune exception symbols from already-compiled libraries and make the resulting binaries even smaller (depending on platform)
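
A rough sketch of the wrap trick, assuming GNU ld's `--wrap` option. `lib_report_error`, `libfoo.a`, and the file names are made up; picture a library symbol that drags in code you don't want. The stub replaces it, and if the original sits in a section nothing else references, `--gc-sections` can then discard it.

```c
/* wrap_stub.c (hypothetical; lib_report_error is a made-up symbol name).
 *
 * Link with:
 *   cc -Wl,--gc-sections -Wl,--wrap=lib_report_error \
 *      app.o wrap_stub.o libfoo.a -o app
 *
 * With --wrap=lib_report_error, undefined references to lib_report_error
 * resolve to __wrap_lib_report_error, and references to
 * __real_lib_report_error resolve to the original. This stub never
 * calls the original, so nothing keeps it alive.
 */

/* Counts how often the stub was hit, for debugging. */
static unsigned stub_hits;

int __wrap_lib_report_error(const char *msg)
{
    (void)msg;   /* the message is dropped */
    stub_hits++;
    return 0;    /* pretend the error was handled */
}

/* Number of calls redirected to the stub so far. */
unsigned wrap_stub_hits(void)
{
    return stub_hits;
}
```

Only undefined references are redirected, so calls made from inside the object file that defines the original symbol still reach the original.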

The question is, why are function-sections and data-sections not the default?

It is quite annoying to have to deal with static libs (including standard libraries themselves) that were compiled with neither these flags nor LTO.

Esoteric? In embedded, people know of these from BEFORE they stop wearing diapers
-ffunction-sections has 750k hits on github. It is among the default flags for opt mode builds in Bazel. There are probably people who consider them defaults, in practice.
Well, C and C++ together have around 7M repos, so about 10%. So not entirely esoteric, but GitHub is only a fraction of the world's codebase, and users of these repos probably never looked in the makefile, so I'd say 10% of C/C++ developers knowing about this is a very optimistic estimate.
Looking at GitHub is probably significantly undersampling the kinds of C projects that would be doing static linking, many of which pre-date GitHub.
It's heavily used, but only in embedded.
> Do people really not know about $OBSCURE_GCC_FLAG?

Do you know what you sound like?

If your whole business revolves around shipping static libraries to customers, then surely reading the man page to work out how that can be done is part of the job.

Also, having your library rely on static initialisation doesn't seem like a very sound architectural choice.

Honestly, this just sounds like whining from someone who can't be bothered to read the documentation, and can't get the coders to take him seriously. In the time it took him to write that article, he could have read up on his tools, and probably learned some social skills too.

And do you know what you sound like when you call a well-known set of flags, the first hit on Google/AI, obscure?
https://xkcd.com/2501/

... It's easy to forget that the average person probably only knows two or three linker flags ...

How can they be expected to learn this, when it is now fashionable to treat C and C++ as if they are scripting languages, shipping header only files?

We already had scripting engines for those languages in the 1990's, and the fact they are hardly available nowadays kind of tells of their commercial success, with exception of ROOT.

It makes more sense for c++ due to templates, but the header only C library trend is indeed very strange. It's not surprising that people are coming up now who are writing articles about being confused by static linking behavior.
Even with C++ templates, if you want faster builds, headers aren't the place for everything: with extern templates, the explicit instantiations for common type parameters live in a source file.
Header-only is simpler to integrate, so it makes sense for simple stuff, or stuff that is going to be used by only one TU there.

However, the semantics of inline are different between C and C++. To put it simply, C is restricted to static inline and, for variables, static const, whereas C++ has no such limitations (making them a superset); and static inline/const can sometimes lead to binary size bloat

It's not strange at all. You only have one file to keep track of and it does everything, you put the functions in any compilation unit you want, C compilation is basically instant, and putting a bunch of single file libraries into one compilation unit simplifies things further.
That might be why back in 1999 - 2002, I was waiting around 1h for each OS build variant of our product, a mix of Tcl and C native libraries, super fast.

It is only basically instant in toy examples, or with optimizations completely disabled.

Sqlite is 6 MB and can compile in 2 seconds on msvc.

It's 25 years later, if you are waiting for an hour to compile a single normal C program, there is a lot of room for optimization. Saying C doesn't compile fast because 25 years ago your own company made a program that compiled slow is pretty silly.

Single file C libraries are fantastic because they can easily all be put into more monolithic compilation units which makes compilation very fast and they probably won't need to be changed.

Have you actually tried what I'm talking about? What are you trying to say here, that you think single file libraries would have made your one-hour pure C build compile slower?

I used to compile sqlite regularly on msvc and it was more than 2 seconds. If this is a measurement it is a recent one with recent hardware.

Sqlite is a single compilation unit for very different reasons than the header only libraries, by the way. It's developed as many different files and concatenated together for distribution because the optimizer does better for it that way.

> It makes more sense for c++ due to templates, but the header only C library trend is indeed very strange.

This is a symptom of the build tools ecosystem for C and C++ being an absolute dumpster fire (looking at you CMake).

> How can they be expected to learn this

It's the first thing Google and LLMs 'tell' you when you ask about reducing binary size with static libraries. Also LTO does most of the same.

To learn, first one needs to want to learn, which was my whole point.
Agreed, but the article's author mentioned this as an issue, so I would have expected him to find out about these flags and mention them as well.
An STB-style header-only library is actually quite perfect for eliminating dead code if the implementation and all code using that library is in the same compilation unit (since the compiler will not include static functions into the build that are not called).

...or build -flto for the 'modern' catch-all feature to eliminate any dead code.

...apart from that, none of the problems outlined in the blog post apply to header only libraries anyway since they are not distributed as precompiled binaries.
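
To make the STB-style point concrete, here is a small sketch with a made-up library, "tiny_math". The macro names imitate the usual STB conventions but are assumptions, not taken from any real library. In real use the middle part would live in tiny_math.h; it is inlined here so the example fits in one file.

```c
/* --- user code: request the implementation, with internal linkage --- */
#define TINY_MATH_STATIC
#define TINY_MATH_IMPLEMENTATION

/* --- tiny_math.h (hypothetical) --- */
#ifndef TINY_MATH_H
#define TINY_MATH_H

#ifdef TINY_MATH_STATIC
#define TM_DEF static   /* every function is private to this TU */
#else
#define TM_DEF extern
#endif

TM_DEF int tm_square(int x);
TM_DEF int tm_cube(int x);

#endif /* TINY_MATH_H */

#ifdef TINY_MATH_IMPLEMENTATION
TM_DEF int tm_square(int x) { return x * x; }
TM_DEF int tm_cube(int x)   { return x * x * x; }
#endif

/* --- user code again: only tm_square is ever called --- */
int area_of_square(int side)
{
    return tm_square(side);
}
```

Because tm_cube is static and never referenced, the compiler is free to leave it out of the object file entirely (it may warn that the function is unused), with no linker flags or LTO involved.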