Next question: why are there two calls to `malloc()`, one for the Pool structure and one for the Chunks?
It is trivial to co-allocate them, which reduces the risk of memory fragmentation and is generally a good idea if you're chasing performance.
The answer might be "for clarity/simplicity", which I guess is fine in an informative article, but it could at least be called out in the text, and I didn't see that.
He's targeting ANSI C, C89. Flexible array members were not officially supported until C99. 26 years later, it's time to stop clinging to outdated standards and crusty tooling. Even Linux had to read the room and adopt C11.
A C11 implementation could go one step further and use `_Alignas(max_align_t)` to keep the pool array aligned with no manual effort. The double allocation gets this implicitly, since `malloc()` returns memory suitably aligned for any fundamental type.
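For what it's worth, here is a rough sketch of the single-allocation version, assuming C11 and a made-up `Pool` layout. The field names and `pool_create`/`pool_destroy` are illustrative only, not the article's or libpool's actual API, and free-list setup is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout: header and chunk storage live in one block. */
typedef struct Pool {
    size_t chunk_sz;   /* size of each chunk in bytes */
    size_t chunk_cnt;  /* number of chunks */
    void  *free_list;  /* head of the free list (not built in this sketch) */
    /* C11: the flexible array member starts at the strictest
       fundamental alignment, so no manual padding is needed. */
    _Alignas(max_align_t) unsigned char chunks[];
} Pool;

/* One malloc() for the header plus chunk_sz * chunk_cnt bytes of chunks. */
Pool *pool_create(size_t chunk_sz, size_t chunk_cnt)
{
    /* Reject a zero chunk size and any size_t overflow in the total. */
    if (chunk_sz == 0 || chunk_cnt > (SIZE_MAX - sizeof(Pool)) / chunk_sz)
        return NULL;

    Pool *pool = malloc(sizeof(Pool) + chunk_sz * chunk_cnt);
    if (pool == NULL)
        return NULL;

    pool->chunk_sz  = chunk_sz;
    pool->chunk_cnt = chunk_cnt;
    pool->free_list = NULL;
    return pool;
}

/* A single free() releases both the header and the chunks. */
void pool_destroy(Pool *pool)
{
    free(pool);
}
```

Because `malloc()` already returns memory aligned for `max_align_t`, the `_Alignas` on the member is what carries that alignment over to `chunks` regardless of how large the header grows.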
On the topic of alignment, the library (libpool) fails to align `chunk_sz` so that a chunk can store a pointer while it sits on the free_chunk list.
This issue is sidestepped in TFA by using a union, which ensures appropriate alignment for all its members, but in the library nothing prevents me from asking for a chunk size of, say, 9 bytes, which would store pointers at misaligned addresses when the free list is created.
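One possible fix, sketched under the assumption that the chunk array itself starts at a pointer-aligned address: round the requested size up before using it. `chunk_size_round_up` is a hypothetical helper name, not something libpool provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: make a chunk big enough to hold a free-list
   pointer and a multiple of the pointer alignment, so every chunk in
   a contiguous array starts at a pointer-aligned address.
   Returns 0 if the rounded size would overflow size_t. */
static size_t chunk_size_round_up(size_t requested)
{
    const size_t align = _Alignof(void *);  /* a power of two */

    if (requested < sizeof(void *))
        requested = sizeof(void *);
    if (requested > SIZE_MAX - (align - 1))
        return 0;

    /* Round up to the next multiple of align. */
    return (requested + align - 1) & ~(align - 1);
}
```

On a typical 64-bit target a request for 9 bytes would become 16, so the second chunk no longer starts at an odd offset.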
Mainly just the size and alignment. If you make the structure oversized and strictly aligned, it can be future-proofed: old binary clients will already provide a structure big enough and aligned enough for the new version.
The dynamic constructor API can allocate the structure exactly sized without the padding.
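A minimal sketch of that idea, using an invented `widget` library (every name and the 64-byte figure are assumptions for illustration). The public type is padded and strictly aligned, the in-place constructor works on caller storage, and the dynamic constructor allocates only what the private struct needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Public header: oversized, strictly aligned, contents unspecified. */
#define WIDGET_STORAGE_SIZE 64
typedef union widget {
    max_align_t   align_;                        /* strictest alignment */
    unsigned char storage[WIDGET_STORAGE_SIZE];  /* room to grow */
} widget;

/* Private implementation: free to grow as long as it still fits. */
struct widget_impl {
    int    id;
    double scale;
};

_Static_assert(sizeof(struct widget_impl) <= sizeof(widget),
               "private struct outgrew the public storage");
_Static_assert(_Alignof(struct widget_impl) <= _Alignof(widget),
               "public storage is not aligned strictly enough");

/* In-place constructor: the caller supplies the padded storage.
   memcpy keeps this sketch clear of strict-aliasing questions. */
void widget_init(widget *w, int id)
{
    struct widget_impl impl = { .id = id, .scale = 1.0 };
    memcpy(w, &impl, sizeof impl);
}

/* Dynamic constructor: allocates exactly the private size, no padding.
   The result must only be used through the library's functions. */
widget *widget_new(int id)
{
    struct widget_impl *impl = malloc(sizeof *impl);
    if (impl == NULL)
        return NULL;
    impl->id = id;
    impl->scale = 1.0;
    return (widget *)impl;
}

/* Accessor that works for both kinds of object. */
int widget_id(const widget *w)
{
    struct widget_impl impl;
    memcpy(&impl, w, sizeof impl);
    return impl.id;
}

void widget_free(widget *w)
{
    free(w);
}
```

The two static assertions are the part that does the future-proofing: if a later version of the private struct no longer fits, the library fails to build instead of silently breaking old clients.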
Is that why some structs have "char reserved[16]" data members? I think I saw this in NGINX, though that might have been to allow module compatibility between the open source offering and the paid version.
Yes, that slightly improves ABI flexibility, though the fact that callers can still access fields limits that.
An alternative is to make the only member `char opaque[16]` (IIRC some locale-related thing in glibc does this, also I think libpng went through a transition of some sort related to this?), but calculating the size can be difficult outside of trivial cases since you can't use sizeof/offsetof.
Couldn't this also just call `mmap` directly? `malloc` is itself a memory allocator, so it feels a bit strange to use it inside an allocator implementation, but perhaps I'm missing something.
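Something like the following would do it on a POSIX system. This is only a sketch: `pool_region_map` and `pool_region_unmap` are made-up names, `MAP_ANONYMOUS` is a widely available extension rather than strict POSIX, and the kernel hands out whole pages, so small pools waste space compared with `malloc`.

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS on glibc under strict -std=c11 */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: get zero-filled backing memory straight from
   the kernel. Returns NULL on failure. */
static void *pool_region_map(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: return the region to the kernel.
   The caller must remember the length, unlike with free(). */
static int pool_region_unmap(void *p, size_t len)
{
    return munmap(p, len);
}
```

The trade-off is portability: this drops ANSI C compatibility entirely, which matters if the article really is targeting C89.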