by 201984 5 days ago
Shared libraries (and mmapped files in general) are deduplicated; it's nowhere near as bad as you think. The kernel loads a .so into memory once and then maps that memory into every process that mmaps it.

Editing to add: this deduplication is one of the greatest upsides to dynamic linking. Common libs like libgcc and libc only have to exist in memory once and can stay in CPU caches, whereas if they were statically linked into every binary, each binary would have a copy of that library that wouldn't be shared with anything else and you'd waste a lot of memory.
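A rough way to see the underlying mechanism from user space: map the same file twice and check that both mappings see the same bytes. This is only a sketch. Shared libraries are normally mapped privately and never written, whereas this demo uses writable shared mappings so the effect is observable. The function name `shared_mapping_demo` and the temp file are made up for illustration.

```python
import mmap
import tempfile


def shared_mapping_demo():
    """Map one file twice; a write through one mapping is visible in the other."""
    with tempfile.NamedTemporaryFile() as f:
        # The file must be at least as long as the region we map.
        f.write(b"\0" * mmap.PAGESIZE)
        f.flush()

        # Two independent mappings of the same file.
        first = mmap.mmap(f.fileno(), mmap.PAGESIZE)
        second = mmap.mmap(f.fileno(), mmap.PAGESIZE)
        try:
            first[0:5] = b"hello"
            # Read back through the *other* mapping.
            return bytes(second[0:5])
        finally:
            first.close()
            second.close()
```

Both mappings are views onto the same cached file pages, which is the same reason a library's read-only pages only need to be resident once.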


Doesn't the loaded code have to be patched for relocations?
It does, so not 100% is reused. The patched parts are in different sections though, so the entire .text (code) section ends up being reused.
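If you want to check this on a Linux box, /proc/<pid>/maps lists each mapping with its permissions. Here is a minimal sketch that totals the executable bytes (clean, file-backed, shareable) versus the writable bytes (private, copied on first write) for one library. The `SAMPLE` text is made-up illustrative data in the real file's format, and `parse_maps` / `summarize` are hypothetical helper names.

```python
from collections import namedtuple

Mapping = namedtuple("Mapping", "start end perms path")

# Illustrative data only, laid out like /proc/<pid>/maps:
# address-range perms offset dev inode pathname
SAMPLE = """\
7f0000000000-7f0000028000 r--p 00000000 08:01 1234 /usr/lib/libc.so.6
7f0000028000-7f00001b0000 r-xp 00028000 08:01 1234 /usr/lib/libc.so.6
7f00001b0000-7f0000208000 r--p 001b0000 08:01 1234 /usr/lib/libc.so.6
7f0000208000-7f000020c000 r--p 00207000 08:01 1234 /usr/lib/libc.so.6
7f000020c000-7f000020e000 rw-p 0020b000 08:01 1234 /usr/lib/libc.so.6
"""


def parse_maps(text):
    """Parse maps-style text into a list of Mapping tuples."""
    mappings = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # not a mapping line
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(Mapping(start, end, fields[1], path))
    return mappings


def summarize(mappings, needle):
    """Return (executable_bytes, writable_bytes) for paths containing needle."""
    selected = [m for m in mappings if needle in m.path]
    code = sum(m.end - m.start for m in selected if "x" in m.perms)
    data = sum(m.end - m.start for m in selected if "w" in m.perms)
    return code, data


def read_self_maps():
    """Return this process's maps text, or '' where /proc is unavailable."""
    try:
        with open("/proc/self/maps") as f:
            return f.read()
    except OSError:
        return ""
```

On a real system, `summarize(parse_maps(read_self_maps()), "libc")` gives the same breakdown for the running interpreter. The writable part is typically tiny compared to the code.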
Not on modern archs that provide decent support for PIE (position-independent executables).
How do you think position independent code can call functions from other .so's without being patched with their addresses?

They can't, so even PIC code still has to have a relocation table that gets patched. It's in a different page than the code though, so code does still get reused.

That's not really patching though, any more than any use of function pointers is patching.
There's a part of the .so ELF file (the Global Offset Table aka GOT) that has to be modified with all the addresses of the functions being imported, which of course vary from process to process.

If not patching, what exactly would you call modifying part of the file?
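A toy model of the arrangement being argued about, in case it helps: the code is immutable and identical for everyone, while each process gets its own table of addresses filled in at load time. This is a sketch, not how a real dynamic loader works. The names (`SHARED_CODE`, `load`, `run`) and the addresses are invented for illustration.

```python
# Immutable "code": each instruction calls through a GOT slot index,
# never through an absolute address. Shared by every process.
SHARED_CODE = (("call", 0), ("call", 1))

# Which symbol each GOT slot refers to (fixed at link time).
IMPORTS = ("puts", "malloc")


def load(symbol_addresses):
    """Per-process step: build a private GOT from this process's addresses."""
    return [symbol_addresses[name] for name in IMPORTS]


def run(code, got):
    """Resolve every call through the GOT; the code itself is never modified."""
    return [got[slot] for _op, slot in code]


# Two processes with libraries loaded at different (made-up) addresses.
got_a = load({"puts": 0x7F0000001000, "malloc": 0x7F0000002000})
got_b = load({"puts": 0x7F1100001000, "malloc": 0x7F1100002000})
```

The same `SHARED_CODE` object produces different call targets per process, and the only thing that differs between them is the table.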

And the GOT is just a big table of pointers like any other table of pointers your application manipulates as it runs.

This isn't meant as a reductive take. The point is that there is a difference between something completely describable in C, like the contents of the .got section, and something like a .reloc section, where building the relocation table to load and link the executable actually requires understanding the generated assembly. Both are linking, but I've saved "patching" for the more brain-surgery-esque techniques. On MIPS, for example, the jump instruction's immediate holds 26 bits of the target's absolute address (the word address, i.e. byte-address bits 27..2), so if the code is loaded somewhere other than where it was linked, you have to go through and modify every jump instruction.
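For anyone who hasn't seen it, here is roughly what that kind of instruction patching looks like for the MIPS `j` encoding (6-bit opcode, 26-bit word index). A minimal sketch: the helper names are made up, it assumes the load offset is a multiple of 4 and that the code stays inside the same 256 MiB region, and it ignores everything else a real loader does.

```python
J_OPCODE = 0b000010          # MIPS "j" opcode
INDEX_MASK = (1 << 26) - 1   # low 26 bits of the instruction word


def encode_j(target):
    """Encode `j target`; the field stores the target's word address."""
    return (J_OPCODE << 26) | ((target >> 2) & INDEX_MASK)


def decode_j_target(insn, pc):
    """Recover the byte address a jump at `pc` goes to.

    The top 4 address bits come from the delay-slot address (pc + 4).
    """
    return ((pc + 4) & 0xF0000000) | ((insn & INDEX_MASK) << 2)


def relocate_j(insn, delta):
    """Patch one jump for code loaded `delta` bytes away from its link address."""
    if delta % 4 != 0:
        raise ValueError("load offset must be word-aligned")
    new_index = ((insn & INDEX_MASK) + (delta >> 2)) & INDEX_MASK
    return (insn & ~INDEX_MASK) | new_index


# Linked to jump to 0x00400100, then loaded 64 KiB higher (illustrative values).
linked = encode_j(0x00400100)
patched = relocate_j(linked, 0x10000)
```

Every absolute jump in the image needs this treatment, which is why such pages end up private to each process instead of shared.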

Not if it's position-independent.