by _bxg1 2377 days ago
Question: I thought any given C code could go in either headers or c files (or rather, split between headers and c files), and that the difference was only a build concern. So why wouldn't a given library be available in both forms, unless one of them just makes no sense at all? Put differently: why isn't this just "Minimal library for writing non-blocking HTTP servers in C" which people understand to mean "this might make sense to put in a header"?
11 comments

"header only" is C-speak for "up and running in 15 seconds" more or less. One major concern with integrating a new dependency in C land is how much of a pain in the ass it makes your build, especially considering a large chunk of C projects do not target typical desktop environments. After the first time you lost a day trying to get some batshit custom build tool to emit the correct link flags for your exotic ARM board, "header only" carries a lot of weight.

On the intermixing of source and header files, it doesn't quite work the way you mentioned. It's not possible to just cut-paste any old public object definition into a header file without at least applying 'static' to it, and that only works where the object truly is private to each translation unit. It's not so easy e.g. to instantiate a global variable shared across the whole program this way, so 'header only' also implies 'almost certainly free of globals', which is a good sign of hygiene
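To make the point about 'static' concrete, here is a minimal sketch of a header whose contents are safe to include from several translation units. The file name `clamp.h` and the function `clamp_int` are made up for illustration. Because the function has internal linkage, every including file gets its own private copy and the linker never sees a duplicate symbol.

```c
/* clamp.h (hypothetical header-only helper) */
#ifndef CLAMP_H
#define CLAMP_H

/* Internal linkage: each translation unit that includes this header gets
   its own private copy, so there are no duplicate-symbol link errors. */
static inline int clamp_int(int value, int low, int high) {
    if (value < low) return low;
    if (value > high) return high;
    return value;
}

#endif /* CLAMP_H */
```

The same approach does not work for state that must be shared program-wide, since a `static` variable would be duplicated per translation unit as well.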

edit: yikes, in this case, the implication is totally invalid

Hijacking your comment to ask my question below. Apart from the possibility that you can't do header-only for everything (like something that requires a singleton, for example) and the other known issues (build times, ballooning of object files and thus executables), is the only other reason you don't see as many header-only libraries in C (unlike in C++, where they are ubiquitous) that people merely don't do it in C? That is, a social landscape reason, not a technical reason.
It's partly a historical reason as well. There are many projects with very complex Makefiles. Those are hard to integrate into your own project.

Header-only is the other extreme: No makefile at all. With package managers like Conan and vcpkg, using a build-system like e.g. CMake, it's possible to have very simple and short project files which are easy to integrate.

In this regard, C and C++ are a bit behind the times compared to other languages.

Isn't C++ supposed to be getting proper namespaces soon?
I am not an expert in C++, but there exists an opinion that it will not help as much as you might think.

https://cor3ntin.github.io/posts/modules/

The solution for the "singleton problem" is to put the implementation code into a separate part of the header behind an #ifdef XXX_IMPL, and only define this before including the header in one place in the project.

This is also the approach used by this library, see:

https://github.com/jeremycw/httpserver.h#example

This approach also doesn't have the 'build time problems', since the (expensive to compile) implementation code is only seen once by the compiler (unlike many C++ header-only libs, which rely on inline code)
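A rough single-file sketch of that layout, using a made-up `counter` library. The names `COUNTER_IMPL`, `counter_inc` and `counter_get` are hypothetical and not taken from httpserver.h. In a real project the part between the markers would live in `counter.h`, and the `#define` would appear in exactly one .c file.

```c
#define COUNTER_IMPL            /* done in exactly one .c file of the project */

/* ---- contents of the hypothetical counter.h start here ---- */
#ifndef COUNTER_H
#define COUNTER_H
/* Interface: seen by every file that includes the header. */
void counter_inc(void);
int counter_get(void);
#endif /* COUNTER_H */

#ifdef COUNTER_IMPL
/* Implementation: compiled only where COUNTER_IMPL was defined, so the
   shared state below exists once in the whole program. */
static int counter_value = 0;
void counter_inc(void) { counter_value += 1; }
int counter_get(void) { return counter_value; }
#endif /* COUNTER_IMPL */
/* ---- contents of counter.h end here ---- */
```

Every other file includes the header without the `#define` and sees only the two declarations.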

The fact that you cannot define templates outside of headers also explains why C++ has a lot of header-only libraries: if you use templates and have no global references, you are most likely already header-only.
Yeah, templates force many C++ libraries to be header-only. I suspect this came first, and only after C++ header-only libraries existed (because they had to be) did people realize there were advantages to header-only, and so people wrote them where they wouldn't have had to.
Yes, the social reasons are the only ones, besides the technical reasons :)

An additional technical downside in this case is the code complexity in the client code, from the required modal preprocessor flags.

"header only" is more like "I don't want to learn"-speak with regard to dealing with native language toolchains.
The native toolchain on my embedded board is a mess. You don't want to learn to deal with it if you don't have to. This comment applies to every embedded toolchain I've ever worked with. Even doing embedded linux with yocto is a mess of a toolchain and it is the best attempt I've ever seen at creating a good embedded toolchain. The problem is just messy.
Usually that boils down to "we don't want to bother with the vendors IDE", e.g. MPLAB.

I have known C and C++ since 1992 and used them across multiple OSes, and I have hardly seen anything that would motivate the fashion for header-only files.

You've never in 28 years lost a single day to a compiler/linker flags/version mismatch and some random tool? I find that difficult to imagine. The horrifying alternative would be that such situations had been encountered but considered justifiable productive work
No, because I either used what was provided by the OS vendor, or library providers that were selling libraries for the deployment we wanted to do.
Well, you can do:

    int a;
    int a;

    static int b;
    static int b;
And it's a valid way to define a single global/static symbol called `a`/`b`.
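A small single-file sketch of that claim, valid when compiled as C (C++ rejects the repeated definitions). File-scope declarations without an initializer are tentative definitions, so each pair below names one object. The helper `set_and_sum` is made up for the example.

```c
int a;          /* tentative definition */
int a;          /* second tentative definition of the same object */

static int b;   /* same idea, but with internal linkage */
static int b;

/* Writes through both names and reads them back, to show that each
   pair of declarations refers to a single variable. */
int set_and_sum(int x, int y) {
    a = x;
    b = y;
    return a + b;
}
```

This only shows the behaviour within one translation unit. Whether repeated `int a;` in several translation units link into one variable depends on the toolchain's handling of common symbols.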
They have different semantics. The static one can only be referenced in the current translation unit, whereas the non-static one is a global which can be referenced from any translation unit.

The problem is that in the header-only case defining "int a" means you can only import the header from a single source file. With "static int a" the visibility is wrong. And if you did "extern int a" linking would fail because the symbol is never assigned to a translation unit.

Header-only libraries sometimes require you to define those externs in exactly one "impl" file, which you compile and link to your artifact. Something like this:

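    // libfoo_impl.c
    // or in some other translation unit, such as main.c
    #define LIBFOO_IMPL
    #include "vendor/libfoo.h"

    // main.c
    #include <stdio.h>
    #include "vendor/libfoo.h"

    int main(void) {
      foo_inc();
      printf("%d\n", foo_count);
      return 0;
    }

    // vendor/libfoo.h
    #pragma once
    extern int foo_count;  // declaration only, visible to every includer
    // static inline, so including this header from several translation
    // units does not produce duplicate definitions of foo_inc
    static inline void foo_inc(void) {
      foo_count += 1;
    }

    #ifdef LIBFOO_IMPL
    int foo_count = 0;  // the single definition
    #endif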
main.c and libfoo_impl.c both include libfoo.h, which declares foo_count with the right linkage, but the variable is only defined once, in whichever translation unit defines LIBFOO_IMPL.

Occasionally you'll find a header library which supports this _IMPL paradigm as an option to avoid inlining its functions in every translation unit that calls them.

You're misunderstanding. I'm saying you can have multiple identical `int a;` across multiple translation units and it will point by default to a single global variable across the program. So you can include a single header from multiple places, and you just get a single shared global variable, if the header contains `int a`.
This is really two files in one. If you have

  #define HTTPSERVER_IMPL
prior to the inclusion, then the header provides the implementation too, otherwise only the interface. Obviously, you must have this HTTPSERVER_IMPL followed by #include "httpserver.h" in only one file in your program.

If someone doesn't like it, they can split the file into two: the header proper and the "impl" part in a .c file.

Just look for the part starting with #ifdef HTTPSERVER_IMPL. Take that out through to the closing #endif and put it into a httpserver.c file. Then put #include "httpserver.h" at the top of that file, and remove all *_IMPL preprocessor cruft from httpserver.c. Now you have a proper single-header, single C file module.

This type of architecture makes more sense when you use it to let the user give arbitrary names and types to the functions, as with e.g. KHASH_INIT(my_t). This version with #ifdef seems a little cargo-culty since the result isn’t as polymorphic as you’d get with klib (which afaict kicked off the header-only craze) so the advantage over a traditional two-file setup is not obvious. (Of course, I don’t know why you need more polymorphism in a small http server!)
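For comparison, a rough sketch of the macro-instantiation style, where the user picks the name and the element type. `DEFINE_STACK` is a made-up macro for illustration and is not klib's API; klib's real macros are considerably more elaborate.

```c
#include <stdlib.h>

/* Generates a growable stack type plus push/pop/free functions for any
   element type. `name` prefixes every generated identifier. */
#define DEFINE_STACK(name, type)                                        \
    typedef struct { type *items; size_t len, cap; } name##_t;          \
    static inline int name##_push(name##_t *s, type value) {            \
        if (s->len == s->cap) {                                         \
            size_t cap = s->cap ? s->cap * 2 : 8;                       \
            type *items = realloc(s->items, cap * sizeof *items);       \
            if (!items) return -1;        /* out of memory */           \
            s->items = items;                                           \
            s->cap = cap;                                               \
        }                                                               \
        s->items[s->len++] = value;                                     \
        return 0;                                                       \
    }                                                                   \
    static inline int name##_pop(name##_t *s, type *out) {              \
        if (s->len == 0) return -1;       /* empty stack */             \
        *out = s->items[--s->len];                                      \
        return 0;                                                       \
    }                                                                   \
    static inline void name##_free(name##_t *s) {                       \
        free(s->items);                                                 \
        s->items = NULL;                                                \
        s->len = s->cap = 0;                                            \
    }

/* One line per instantiation; each gets its own type-safe functions. */
DEFINE_STACK(int_stack, int)
DEFINE_STACK(dbl_stack, double)
```

A caller would start from a zero-initialized `int_stack_t` and use `int_stack_push` and `int_stack_pop` on it. This kind of per-type code generation cannot be expressed with a plain .h/.c pair, which is why it has to live in a header.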

As noted above, it avoids version incompatibility by essentially forcing static linking. But most developers would statically link a small two-file library anyway, so it’s a moot point. C isn’t supposed to have training wheels.

>C isn’t supposed to have training wheels.

And then there came Arduino.

We really need to fix the lack of documentation / warnings for these generic C libraries. If you need to understand token-combining preprocessor magic (looking at you, kbtree) due to a lack of proper documentation, these newbies _will_ just try to wing it and _will_ stop poking as soon as it seems to work, regardless of why/how. Say hello to use-after-free & its cousin, memory leaks.

Thankfully Arduino uses C++, there are no training wheels beyond taking advantage of C++ improvements over C, and a beginners friendly IDE.
For instance, I used this trick to create a shared library wrapper (something that calls dlopen, but provides stubs so the program thinks it's just calling a build-time-linked library).

Macros defined the individual functions themselves, in one place, (what are their names, arguments, and which library). A regular #include of these macros provided the declarations to the program. The shared lib wrapper module #defined the definition version of these macros prior to the #include, which expand to the invocation stubs.

It makes sense because the users who maintain it just have one place to add a new function; they don't have to do copy and paste programming to write the invocation wrapper.
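A stripped-down sketch of that idea, with the function list written once and expanded several ways. Everything here is hypothetical: the list `WRAPPED_FUNCS`, the functions `add` and `mul`, and the `real_*` functions, which merely stand in for symbols a real wrapper would look up at run time.

```c
/* One list describes every wrapped function: return type, name,
   parameter list, and argument list. New functions are added here only. */
#define WRAPPED_FUNCS(X)                 \
    X(int, add, (int a, int b), (a, b))  \
    X(int, mul, (int a, int b), (a, b))

/* Expansion 1: plain declarations, what a regular #include would provide. */
#define DECLARE(ret, name, params, args) ret name params;
WRAPPED_FUNCS(DECLARE)
#undef DECLARE

/* Stand-ins for the functions that live in the dynamically loaded library. */
static int real_add(int a, int b) { return a + b; }
static int real_mul(int a, int b) { return a * b; }

/* Expansion 2: one function pointer per entry, NULL until resolved. */
#define POINTER(ret, name, params, args) static ret (*name##_ptr) params;
WRAPPED_FUNCS(POINTER)
#undef POINTER

/* Expansion 3: invocation stubs that forward to the resolved pointer. */
#define STUB(ret, name, params, args) \
    ret name params { return name##_ptr args; }
WRAPPED_FUNCS(STUB)
#undef STUB

/* Expansion 4: the resolver. A real wrapper would fill the pointers from
   the loaded library; this sketch assigns the local stand-ins instead. */
#define RESOLVE(ret, name, params, args) name##_ptr = real_##name;
void wrapper_init(void) { WRAPPED_FUNCS(RESOLVE) }
#undef RESOLVE
```

After `wrapper_init()` has run, callers simply use `add(2, 3)` as if it were linked at build time. Calling a stub before initialization would dereference a null pointer, so a real wrapper needs some policy for that.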

To elaborate on the other responses, “single-header” means “one-line import into your project”, and since C doesn’t have a standard module system, the alternative means hooking up the library’s module system to yours, or otherwise separately building and linking it to your project.
> the alternative means hooking up the library’s module system to yours

What "module system"? The alternative would be one .c file and one .h file. It's negligible effort to add these to any project that already has more than one .c file.

I meant build system, such as cmake. I assume the alternative to a single-header library would be lots of .c and .h files, and something to build them into a dynamically-linked library. The alternative, just dumping them into your project, means that once you get into .c files you typically have to start enumerating those somewhere in your own build system, rather than just including them from one of your own headers.
Besides, even for a single translation unit (such as a quick hello world) you could just #include the .c file to avoid setting up a makefile.
Short term convenience

Long term technical debt

How much of that comparison ignores the time sink that is setting up linkers and dealing with that whole lotta mess?

Header only works for some things (particularly things that require no globals, singletons, etc) and that's a valid concern. Saying header-only = long term technical debt always (or even most of the time) feels like an assertion because I've only heard hypotheticals around why it is bad.

Is it really a time sink? You have to have a build system anyway to link together your project itself, assuming it's bigger than a single-file hello world. One extra line in your build system should be the least time consuming part of adding a new foreign dependency--you still have to vet the code for security and figure out how to use it.

Unless you're using C++20 modules, you also have to deal with possibly including the header multiple times (slowing down builds), namespacing, macros potentially defined by the header, or a bunch of external/internal linkage edge cases. You only ever find out about these problems once it's too late to remove that library for a different one.

> One extra line in your build system should be the least time consuming part of adding a new foreign dependency

Could, should. IRL Docker became a thing mostly because of the hassle it is to do so in C.

It depends. But if that header only library is now in your hands - good luck modifying it without paying the penalty of recompiling anywhere else it's used.

Also, you now have to prefix the hell out of it, as even an anonymous namespace won't work. And not only that.

It feels more like being too lazy to learn the proper way of doing it.
The proper way to do it sometimes is horrendously complex or uses a build system different from the one you use. Header only avoids this usually.
Building a plain .a/.lib or .so/.dll is not horrendously complex.
Package management is often hideously complex and a central sink of labor in Linux distros. Building it for yourself is easy; in practice, using dynamically linked libraries is the start of all sorts of troubles.
I just watched a cppcon video where Bjarne himself lamented about the complexity of installing many libraries for use in C++.
I’ve personally had great difficulty building these for large libraries, especially when they use a build tool different than the one I’m familiar with (I use cmake for C++ these days but I find cmake itself horrendously complex and difficult to understand even after investing significant time into it).
Header only fixes the lack of build system/module system as part of the language. Some popular build systems include: make, cmake, Visual Studio, Xcode project. Each has a different format for specifying source/header/libraries - and even if you provide project files for all of the above, then what about the other 1,000 build systems?

Header only is nice and easy, dump file in the project, #include and your job is done.

Besides build considerations, you get a bit more performance out of header-only libraries and single, large C files.

That technique is used in sqlite. https://www.sqlite.org/amalgamation.html

> Combining all the code for SQLite into one big file makes SQLite easier to deploy — there is just one file to keep track of. And because all code is in a single translation unit, compilers can do better inter-procedure optimization resulting in machine code that is between 5% and 10% faster.

Another key use case for header-only libs is for games and projects where you need to do rapid prototyping. A lot of game utilities are distributed this way, e.g. https://github.com/RandyGaul/cute_headers
Even though I dislike C, I have my share of years coding in it, and of all the issues I have with it, a lack of rapid prototyping was never one of them.
This is not the case due to the one-definition rule.
In a C compiler (cc command), there are only compilation units (.c -> .o). Headers (.h) were bolted-on (cpp command) to deduplicate interfaces (forward declarations).

This header-only containing implementations adds its code to every compilation unit and the linker has to be smart enough to deduplicate identical symbols across many compilation units. Linkers (ld command) are provided either by the platform vendor or the compiler vendor, so they may or may not be able to do this. Binutils on Linux is able to do so IIRC.

It's a terrible practice that promotes cowboy coding. It should just use two files, which would cause fewer problems in the real world.

You can also write C without headers or newlines. IOCCC may welcome it but why would you ship it or encourage people to use it?

> This header-only containing implementations adds its code to every compilation unit and the linker has to be smart enough to deduplicate identical symbols across many compilation units.

That's not how this works. It's actually a pretty clever trick, and it effectively behaves as a .h / .c pair of files. See https://github.com/nothings/stb/blob/master/docs/stb_howto.t...

“Clever trick” should be treated as a pejorative.

Knowing why I have to `#define FOO_IMPL` before I include the header, but only from one of my compilation units, is a “clever trick” that now anybody reading or maintaining my code has to tuck away into their brain as well. Along with all the other tricks they’ve had to memorize because some other asshole thought they were being clever too.

C/C++ lack the niceties of npm/cargo/pip/etc., so header-only is like wget something.h and gcc mystuff.c: voilà, magic.

It's a symptom of the tooling that this is even a thing.

There are mature solutions for building C/C++ programs which check for dependencies and provide a high quality configuration setup. Think of GNU Autotools. It is often blamed for being too complex, but this complexity is essential, not accidental. You do not want to abstract away details a C/C++ maintainer cares about.
It’s not that autotools is complex - rather, most of what it does hasn’t been useful since the nineties, and when it doesn’t work - which is quite often when it’s being used on a platform a given piece of software hasn’t been tested on - it makes fixing the build harder compared to hand-written makefiles.
Right, but that still doesn't solve the problem of actually distributing the library in the first place.

As a library author you basically have to wait for distributions to start shipping it, or make users manually install dependencies (and then make the build work with libraries installed into /usr/local).

Right, this is an extraordinary pain point for a library author who writes libraries in C or C++.

That's assuming the distro even does package it up. Because beyond Arch Linux packaging, packaging up a DEB is painful. RPM is somewhat less painful, but you still need to contend with N distros. Your lib depends on, let's say, 4 other libraries, perhaps some of the GTK ecosystem of libraries. Good luck!

Autotools doesn't help you solve the issue of you needing version X.Y when Z.X is installed, neither do RPM/DPKG/etc.

When you go to build C projects, there's usually a long list of instructions, per distro/os to set the environment up correctly.

Compare that to a Rust project where you git clone and cargo build to get going. No instructions needed, and no insanely complicated esoteric m4 macro system, ad-hoc programming language (looking at you, CMake), or snowflake package manager needed.

Meson is the only C/C++ build tool I know of that solves the issue of building complex C/C++ programs with dependencies sanely. Especially in the case where both your application and your dependencies are unreleased as of yet, as is the case with Gnome and KDE applications as they are developed.

> niceties of npm

Shudder

It seems more like being lazy not to create an RPM/DEB/MSI/PKG/DEPOT,... to me.

Or even plain old tarball with any sort of build script.

And vcpkg and cargo are already good options for those who need a cross-OS package manager.

npm for C is apt/brew
Not really, because the amount of provided package versions is very limited. You won't get an update unless the package maintainer (who is likely not the library owner) provides an update.
That's a feature, not a bug. I want to rely on well tested, stable, mature libraries that receive security updates and don't change for the next 3 to 5 years.
npm for js is apt/brew for os
"dnf install" or "apt-get install" work fine.
System package managers rarely offer all of the packages you need.
So .. add them? They can even be added automatically, as we do for Perl and Tex packages in Fedora.
You're right. "Header only" libraries are an anti-pattern in C and the fastest way for me to close the tab when considering a new library. Just put it in two files and it's still dead easy to incorporate into a project.
I don't really understand what your criticisms are. There is literally no functional code difference between a single header library and a two file library. Your aversion is as arbitrary as saying you don't consider libraries if the API uses camel case.
Other than not being able to use pseudo-modules in C, using a gigantic header file instead.
Can you compile a header to an object file so that its definitions don't have to be recompiled every time the including file is recompiled?
Eh... Yes?

GCC [0], Clang [1] and others [2] have support for precompiled headers. CMake also has support for precompiled headers [3].

I would say both the tool and architecture to do that is well supported.

[0] https://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html

[1] https://clang.llvm.org/docs/PCHInternals.html

[2] https://www.qnx.com/developers/docs/6.3.2/neutrino/utilities... (-pch flag)

[3] https://cmake.org/cmake/help/latest/command/target_precompil...

I can sort-of understand (and still disagree) for header-only template libraries. But in general yes I agree, "header-only" is an anti-pattern for high quality C and C++ projects.
> "Header only" libraries are an anti-pattern in C

says who? care to elaborate?

Separating the interface from the implementation details? Isolating the server code from the module that uses it? Because the implementation is in the header, including it defines a bunch of macros that could easily collide with macros in the larger project.

Not knocking the overall effort here, but yes, "header only" libs are definitely an anti-pattern in real world C. I assume the author considers it a curiosity, a novelty, or something that would only be used in very limited circumstances that make such a design an advantage.

Only one inclusion expands to produce the implementation.
Still suffers from the isolation problem. For example, if you happen to define HTTP_BODY in your own code, it segfaults.

This can be avoided by being careful to only define HTTP_BODY after including httpserver.h, but avoiding this type of thing to worry about is the entire point of interface/implementation separation.
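A small sketch of why the order matters, with made-up names (`mini_http.h`, `MINI_BODY`, `mini_is_body`), not the actual contents of httpserver.h. The preprocessor rewrites text before the compiler sees it, so a user macro only affects code that comes after its definition.

```c
/* ---- hypothetical mini_http.h, interface and implementation together ---- */
enum mini_state { MINI_HEADERS, MINI_BODY };

static inline int mini_is_body(enum mini_state state) {
    return state == MINI_BODY;   /* refers to the enum constant */
}
/* ---- end of mini_http.h ---- */

/* User code. Defined AFTER the header, so the library code above was already
   compiled with MINI_BODY meaning the enum constant. Moving this #define
   above the header would rewrite the library's own source text, here into
   something that no longer compiles. */
#define MINI_BODY "hello"

static inline const char *user_body(void) {
    return MINI_BODY;            /* expands to the user's string */
}
```

With a separate .c file for the implementation, the user's macros never reach the library's internals at all, which is the isolation being argued for.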

This only causes a problem if you define HTTP_BODY in the same file that you define HTTPSERVER_IMPL when including httpserver.h
How is that different from anything else in C or C++? What does that have to do with single header libraries? Preprocessor isolation is just now possible with C++ modules.
100% agree with you. I really think adding a source file to your build tool of choice (Make, CMake, Visual Studio project) is a basic fundamental skill that a C or C++ developer should know.
C is already an anti pattern for a library.

It is soon 2020, there are no reasons to use C in new code, because of security and also convenience.

Maybe it is a nice mental exercise (also try brainfuck and malbolge), but not something for production.