| Another thing I noticed in the revised blog post on a second skim, regarding this claim:
> Every time, 70% of a couple hundred megabytes are copied around for no good reason and someone needs to pay ingress/egress networking costs for these file copies.
That 70% includes the ELF/DWARF metadata that is easily removed from the binary using strip. It's true that the DWARF info in particular has gotten larger in recent releases, because we've included more information to make debuggers work better. I don't think it has grown quite as rapidly as the table indicates - I think some of the rows already have some of the debug metadata removed in "Raw size". Regardless, I would hope that anyone sensitive to networking costs at this level would be shipping around stripped binaries, so the growth in accurate DWARF info should not be relevant to this post at all. That is, the right comparison is to the "Stripped" column in the big table.
If you subtract out the strippable overheads and you take the "Dark + pclntab" as an accurate representation of Go-specific overhead (debatable but not today), then the situation has actually improved markedly since Go 1.12, which would have been current in April 2019 when the first post was written. Whereas in Go 1.12 the measured "actual program" was only about 40% of the stripped binary, in Go 1.16 that fraction has risen to closer to 55%.
This is a marked-up copy of the table from the dr-knz.net revised post that at time of writing has not yet made it to cockroachlabs.com: https://swtch.com/tmp/cockroach-blog.png
I think the numbers in the table may be suspect in other ways, so I am NOT claiming that from Go 1.12 to Go 1.16 there has actually been a 15% reduction in "Go metadata overhead". I honestly don't know one way or the other without spending a lot more time looking into this. But supposing we accept for sake of argument that the byte counts in the table are valid, they do not support the text or the title of the post.
In fact, they tell the opposite story: the stripped CockroachDB binary in question has gotten smaller since April 2019, and less of the binary is occupied by what the post calls "non-useful" or "Go-internal" bytes. |
> I would hope that anyone sensitive to networking costs at this level would be shipping around stripped binaries, so the growth in accurate DWARF info should not be relevant to this post at all.
Good point. I removed that part from the conclusion.
> If you subtract out the strippable overheads and you take the "Dark + pclntab" as an accurate representation of Go-specific overhead [...] then the situation has actually improved markedly since Go 1.12 [...] Whereas in Go 1.12 the measured "actual program" was only about 40% of the stripped binary, in Go 1.16 that fraction has risen to closer to 55%.
Ok, that is fair. I will attempt to produce a new version of these tables with this clarification.
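For reference, a minimal sketch of the arithmetic being discussed, assuming the "actual program" share is simply the stripped size minus the "Dark + pclntab" bytes, divided by the stripped size. The `sizes` struct, the `programFraction` helper, and the byte counts in `main` are hypothetical placeholders for illustration, not values taken from the table.

```go
package main

import "fmt"

// sizes holds byte counts for one build of the binary.
// The struct and its field names are hypothetical; they only mirror
// the column labels quoted in the thread.
type sizes struct {
	stripped int64 // "Stripped" column: size after removing ELF/DWARF metadata
	dark     int64 // "Dark" column
	pclntab  int64 // "pclntab" column
}

// programFraction returns the share of the stripped binary that remains
// after subtracting dark + pclntab, i.e. the "actual program" share
// under the assumption stated above.
func programFraction(s sizes) float64 {
	if s.stripped <= 0 {
		return 0
	}
	return float64(s.stripped-s.dark-s.pclntab) / float64(s.stripped)
}

func main() {
	// Placeholder numbers chosen for easy arithmetic, NOT real measurements.
	example := sizes{stripped: 100, dark: 25, pclntab: 20}
	fmt.Printf("actual program: %.0f%% of stripped binary\n", 100*programFraction(example))
}
```

Running this per Go version against the real "Stripped", "Dark" and "pclntab" byte counts would give the per-release percentages that the comparison above relies on.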
> the stripped CockroachDB binary in question has gotten smaller since April 2019, and less of the binary is occupied by what the post calls "non-useful" or "Go-internal" bytes.
There's an explanation for that: the crdb code itself was also reduced in that time frame.