Hacker News new | ask | show | jobs
by jstimpfle 2471 days ago
I don't get it. A linker's task should be straightforward. In essence, it looks up addresses from strings, whose number is bounded by the lines of code written by humans. I think that there must be a lot of incidental complexity if that task somehow becomes a bottleneck.

And how can it be that a binary called "cmd/compile" has 170k symbols (that's like, global definitions, right?). Not that that's a huge number in terms of today's computing power, but how many millions of lines of source code does that correspond to?

Still, 1M relocations, or 34MB of "Reloc" objects, as indicated, shouldn't be a huge issue to process. Object files should have minimal parsing overhead. Is there any indication how long this takes to link? Shouldn't 1 or 2 secs be sufficient? (assuming 100s of MB/s for sequential disk read/write, and < 1us to store each of the 170k symbols in a hashmap, and < 1us to to look up each of the 1M of relocations).

- I don't think mmap should be used if it can be avoided. It means giving up control over what parts of the file are loaded in memory. And from a semantic point of view, that memory region still must be treated differently, since on-disk and in-memory data structures are not the same.

4 comments

> that's like, global definitions, right?

No, not just global definitions. Closures need linking, too. But the linker is doing much more than linking function entry points. Many automatic (on the stack) variables need linking so the GC can (a) trace the object graph and (b) move them when resizing the stack. Likewise, type definitions require metadata generation for GC tracing. And then there's all the debugging data that needs to be generated, which basically involves everything.

Go currently does DWARF generation in the linker, but is in the process of moving that into the compiler. Also, most closures linking problems are relatively easy in Go.

As far as stack variables for the GC, the stack is exactly scanned for all but the last frame (I believe), but conservatively scanned for the last frame. This makes it much easier.

Types -- you're partially right. It is mostly all generated in the compiler, and deduped by the linker.

There's no reason Go's linker couldn't be much simpler and likely even simpler than a standard C linker. You might argue that Go will give up things like LTO, but Go designs out lots of the LTO problem (not PGO) by nature of how packages work, and the fact that there aren't cyclic dependencies.

Overall, the Go linker could be quite simple. It just needs some rework is all. :)

What do you mean by closures, and by linking of closures?

[I've implemented a Scheme before, in C, for context, so I'm wondering if I'm reading what you wrote with common vocabulary.]

The OP was referring to global symbol definitions and the first thing that came to mind were closures, which are generated as implicit global function definitions taking a context argument. Because Go supports lexical closures as first-class objects, they're not uncommon in typical Go code. (Though my experience is limited as I don't use Go regularly.) Point being, there can be hidden, effectively global function definitions. I'm sure that only accounts for a small fraction of the total symbols; it's just the first thing that came to mind, and one of the easiest to understand if your notion of a linker comes from what a Linux runtime linker does for C ABI-based object code.
I read that mostly as making sure capture is handled properly.
I too often don't see why a thing would not be simple and straightforward, until I've done it a few times.
> I don't get it. A linker's task should be straightforward.

In Go the linker also generates DWARF, does deadcode elimination and a bunch of other stuff described in the associated document.

> And how can it be that a binary called "cmd/compile" has 170k symbols (that's like, global definitions, right?). Not that that's a huge number in terms of today's computing power, but how many millions of lines of source code does that correspond to?

As described in the link each global function actually ends up producing 4 symbols.

> Shouldn't 1 or 2 secs be sufficient?

On my system linking cmd/compile takes 1.3 seconds. The problem is that object files get cached so if you have a hot cache (let's say you made a few changes inside a single package and then recompile) compiling takes almost no time and linking accounts for nearly 100% of the build time.

One nice speed improvement for linking C++ (and I imagine C as well) with GCC and Clang is -gsplit-dwarf which does not link debug information but instead only references the original object files.

This of course only makes sense for development where the object files are still available for GDB to load the debug symbols from but it is a nice feature.

The MSVC linker has /DEBUG:FASTLINK as well.

I don’t know how advanced the go linker is, but the distinction between compiler and linker gets blurred when one adds link-time optimization (https://llvm.org/docs/LinkTimeOptimization.html, https://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html, https://en.m.wikipedia.org/wiki/Interprocedural_optimization)

Even if it doesn’t do that, fixing up various addresses can be a lot of work if the linker can make code shorter (e.g. by using short branches where possible), or if it tries to increase cache locality by changing function order.