by parhamn 1888 days ago
The article says they don't really care about initialization time though, which is right.

Remember: CockroachDB is always synchronizing data across the cluster. The 175 MB of ingress needed to start up a DB node probably pales in comparison to the data synchronization and relocations that happen on a cluster, which is why worrying about ingress/egress costs over binary size is nonsense here too.

The bandwidth you need to run a distributed database cluster could download a 172 MB binary in milliseconds. If your node initialization time for DB failovers needs to be any faster than that, you're doing something wrong.

There are stakeholders to this problem, but Cockroach probably isn't one.
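
A back-of-the-envelope check of the bandwidth claim above, as a rough sketch. The link speeds are assumptions for illustration and do not come from the thread, and the calculation ignores latency, protocol overhead, and disk write speed, so real downloads will be slower.

```python
def transfer_seconds(size_mb: float, link_gbit_per_s: float) -> float:
    """Ideal transfer time for a file, ignoring latency and protocol overhead."""
    size_bits = size_mb * 1_000_000 * 8                 # decimal megabytes -> bits
    link_bits_per_s = link_gbit_per_s * 1_000_000_000   # Gbit/s -> bit/s
    return size_bits / link_bits_per_s


# Assumed link speeds (hypothetical, chosen only to show the order of magnitude).
for gbit in (1, 10, 25):
    ms = transfer_seconds(172, gbit) * 1000
    print(f"{gbit:>2} Gbit/s: {ms:.0f} ms")
```

Under these assumptions a 172 MB binary takes roughly 140 ms on a 10 Gbit/s link and over a second on 1 Gbit/s, so "milliseconds" holds only on fast datacenter links.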


For production, yes. But it also affects startup and download time on developer machines. Want to have multiple versions installed? Now it takes more space. It takes longer to download on 4G while on the road, or on crappy corporate/conference WiFi, etc.

In the end this all adds up, because it applies to all Go binaries. I've come to appreciate attention to leanness because in the end it does add up.

Here is a question: imagine you could double the performance of CockroachDB by making the executable 2000MB. Every DB admin would make that choice.
Yes, if it would double the performance in many cases, then the size would have a significant benefit.

Here we are talking about large binaries without apparent benefit, and about a drive to keep binaries as lean as possible.

This is effectively what is happening btw; the crdb binary went from 80MB to 200MB in the same time it took to make it twice as fast. The % growth in size is not a problem on its own; it's more the % size attributed to the program vs. the % size attributed to unclear purposes, that's a problem.
What's the use case where a 200MB binary size is a problem?
The use case where said binary is shipped to GCE instances hundreds or thousands of times per day, for stress testing and unit testing of CockroachDB.
If most of the bytes are indeed cruft, can't you just send binary diffs around? I imagine they shouldn't change that much between compilations. It seems to be an issue for DB developers but not for users.
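
To make the binary-diff suggestion above concrete, here is a minimal sketch that estimates how many bytes of a new build would need to be sent if the receiver already holds the old one. It compares fixed-size, aligned blocks by hash. The function names and the 4096-byte block size are made up for illustration, and nothing here reflects how CockroachDB actually ships binaries. A single inserted byte shifts every later block and defeats this naive scheme, which is why tools such as rsync use rolling checksums instead.

```python
import hashlib


def block_hashes(data: bytes, block_size: int = 4096) -> set[bytes]:
    """SHA-256 digests of each fixed-size, aligned block of `data`."""
    return {
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    }


def changed_bytes(old: bytes, new: bytes, block_size: int = 4096) -> int:
    """Bytes of `new` whose blocks do not appear anywhere in `old`."""
    known = block_hashes(old, block_size)
    total = 0
    for i in range(0, len(new), block_size):
        block = new[i:i + block_size]
        if hashlib.sha256(block).digest() not in known:
            total += len(block)
    return total


# Synthetic 1 MiB "binary" made of distinct 4-byte counters (stand-in data only).
old = b"".join(i.to_bytes(4, "big") for i in range(256 * 1024))
new = bytearray(old)
new[5000] ^= 0xFF  # flip one byte inside the second block
print(changed_bytes(old, bytes(new)), "of", len(new), "bytes would be resent")
```

Whether real builds diff this well is an open question: a recompile can move code and change addresses throughout the file, so the actual savings would have to be measured on two real binaries.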