by parhamn 1886 days ago
> In other words, the Go team decided to make executable files larger to save up on initialization time.

I mean... I'm genuinely curious whether this is a "we have extra engineering resources and can explore/complain about this" or a "we have a client who is running cockroachdb and can't handle a 172mb binary install for a database server".

Is there really someone out there who installs Cockroach (a global distributed auto-sharded database) and thinks twice about 172mb of disk space?

Sure, it'd be nice to have smaller binaries, but outside of some embedded applications Go's binary sizes are well within the nothing-burger range for most compute systems.


It affects the download and instantiation time for containers.
The article says they don't really care about initialization time though, which is right.

Remember: cockroachdb is always synchronizing data across the cluster. The 172mb of ingress needed to start up a DB node probably pales in comparison to the data synchronization and relocation that happens on a cluster, which is why worrying about ingress/egress costs over binary size is nonsense here too.

The bandwidth you need to run a distributed database cluster could download a 172mb binary in milliseconds. If your node initiation time for DB failovers needs to be any faster than that, you're doing something wrong.

There are stakeholders to this problem, Cockroach probably isn't one.

For production, yes. But it also affects startup and download time on developer machines. Want to have multiple versions installed? Now it takes more space. It takes longer to download on 4G while on the road, or on crappy corporate/conference WiFi, etc.

In the end this all adds up, because it applies to all Go binaries. I’ve come to appreciate attention to leanness because in the end it does add up.

Here is a question: imagine you could double the performance of CockroachDB by making the executable 2000MB. Every DB admin would make that choice.
Yes, if it would double the performance in many cases, then the size would have a significant benefit.

Here we are talking about large binaries without apparent benefit and a drive to keep binaries as lean as possible.

This is effectively what is happening btw; the crdb binary went from 80MB to 200MB in the same time it took to make it twice as fast. The % growth in size is not a problem on its own; it's more the % size attributed to the program vs. the % size attributed to unclear purposes, that's a problem.
What's the use case where a 200MB binary size is a problem?
I worry about that for 10gb omnibus containers, not so much for <500mb.
172MB for a Docker image that contains a DB is pretty small. Go Docker images are among the smallest because a Go binary can run with minimal deps.
This isn't an interesting or convincing argument because it revolves around a non-satisfiable metric. No matter the size of an artifact, this line of reasoning can always be used to claim it's too big and needs to be smaller.
One word: trade-offs.
Go binary size makes it not an option for wasm.
There are go variants for this. See tinygo [1], which targeted embedded originally (iirc) but now also targets wasm.

So you’re definitely correct that core Go is not an option, but options exist within the “greater metropolitan area” that’s built up around downtown.

These are among the benefits of having a relatively simple language and a 1.0 compatibility commitment, I think.

[1]: https://tinygo.org/ their FAQ is quite good.

Tinygo is severely limited, e.g. it doesn't support encoding/json. Want to deal with any JSON API, or anything that indirectly uses JSON serialization? Forget about it.

A current side project of mine uses golang's wasm. Not a big codebase, but the wasm is 2.7MB brotli'ed. Certainly huge to me (I'm sure it's almost in the lean and mean camp compared to the average website today, though).

Yeah, tinygo was originally intended for embedded, so the good news is that many significant wasm binary size optimizations are still to come, I think.

Today, there are ways to get the binary smaller, but they involve avoiding certain libraries or preferring certain implementations of common libraries over others.

It’s definitely an art that requires conscientious practice these days. A run of the mill go application will not compile down to a tiny wasm with tinygo, without refactoring.

But it’s possible. Like I said, it’s on the outskirts, not downtown. Still within reach at least.

encoding/json sucks anyway, so using something 3rd party would likely be a better option
How so?
I've run into a few issues:

1. Performance: a lot of reflection and allocations, so not the fastest thing ever.

2. Case insensitivity: https://play.golang.org/p/LtwChO_tp0 this can be pretty unpleasant when it bites you

3. Unicode: It replaces invalid UTF-8 with Unicode question marks (I don't know exactly what the replacement character is called) but does not provide any API to detect that it happened. So that can lead to silent corruption of the data.
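Points 2 and 3 can be reproduced with a minimal sketch like the one below. `User` and `decodeName` are hypothetical names made up for illustration; the behavior shown is what the encoding/json documentation describes for `json.Unmarshal`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User is a hypothetical type used only for this illustration.
type User struct {
	Name string `json:"name"`
}

// decodeName unmarshals raw into a User and returns its Name field.
func decodeName(raw string) (string, error) {
	var u User
	err := json.Unmarshal([]byte(raw), &u)
	return u.Name, err
}

func main() {
	// Case insensitivity: the key "NAME" is not an exact match for the
	// tag "name", yet Unmarshal still fills the field and reports no error.
	name, err := decodeName(`{"NAME": "alice"}`)
	fmt.Printf("%q %v\n", name, err)

	// Invalid UTF-8: the byte 0xff is not valid inside a UTF-8 string.
	// Unmarshal substitutes U+FFFD and returns a nil error, so the caller
	// gets no signal that the input was altered.
	name, err = decodeName("{\"name\": \"a\xffb\"}")
	fmt.Printf("%q %v\n", name, err)
}
```

The only way to notice the substitution afterwards is to scan the decoded strings for U+FFFD yourself, which cannot distinguish a replaced byte from a replacement character that was genuinely in the input.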

I think it’s well demonstrated to be slow (ns/op) and wasteful (allocs/op) in benchmarks, especially in bigger payloads.

The HTTP library is similar.

Most 3rd party libs make some trade offs that are pretty reasonable to achieve superior perf, but it’s worth understanding them to make sure it’s best for your use case.
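To check the ns/op and allocs/op claims on your own payloads rather than take them on faith, a small sketch along these lines may help. The `record` type, the payload and `benchUnmarshal` are assumptions invented for the example; results depend heavily on machine, Go version and payload shape, so no numbers are quoted here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// record is a hypothetical payload shape; swap in your own type.
type record struct {
	ID   int      `json:"id"`
	Name string   `json:"name"`
	Tags []string `json:"tags"`
}

// payload is an assumed sample document, deliberately tiny.
var payload = []byte(`{"id": 1, "name": "node-1", "tags": ["a", "b", "c"]}`)

// benchUnmarshal runs a standard-library benchmark outside of "go test"
// and returns the raw result.
func benchUnmarshal() testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			var r record
			if err := json.Unmarshal(payload, &r); err != nil {
				b.Fatal(err)
			}
		}
	})
}

func main() {
	res := benchUnmarshal()
	fmt.Printf("%d ns/op, %d allocs/op, %d B/op\n",
		res.NsPerOp(), res.AllocsPerOp(), res.AllocedBytesPerOp())
}
```

Running the same function body against a third-party decoder, with the same payload, gives a like-for-like comparison of the trade-offs mentioned above.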

TinyGo will soon support encoding/json.
I tried tinygo on web assembly. I very quickly decided to use Rust instead.
Why not? If you use wasm for a small library need in your web app, then sure, go doesn't make sense. But if your whole app is coded in wasm like a game, then it's probably ok I guess as the app will be heavy to load regardless.
Ok, sure. If Go wasm is only useful for games... and I'm pretty sure game developers are picking C++ or Rust, then that's a severe limitation.
So if tomorrow the Go team doubled your binary size by adding a "darkest bytes" area with garbage in it, your response would be to question the users' inability to be ok with it, and not the Go team's choice and reasoning?
If you think those are the alternatives, I don't think the rest of this discussion will go well.
Feels like the Go team could just...expose a flag?

Which they are loath to do

Flags are juvenile. When I was young I used to play with flags. Now I set environment variables.
> Is there really someone out there who installs Cockroach (a global distributed auto-sharded database) and thinks twice about 172mb of disk space?

I think it is exactly this mindset that caused the current situation (that the 2/3 of the compiled binaries are useless to the users).

The alternative isn't avoiding fixing it. The alternative is a boring bug report on github with actual stakeholders discussing the best strategies and trade offs.

This is only on HN's homepage because of the language flame wars. It's a garbage post.

I agree 100% and I sincerely hope the author did that already. Just complaining about it to vent one's frustration will accomplish nothing.
> (that the 2/3 of the compiled binaries are useless to the users).

That the bytes are not visible in the symbol table is inarguable, that they are useless is a highly contentious statement and very probably wrong.

It's possible, but it is the "who cares" mindset I object to. File sizes are important. Memory usage is important. CPU cycles are important. Many devs with powerful machines don't give a heck and in the end everybody has to pay, in different ways.