by arsv 4145 days ago
> Where we're at in February 2015

Still producing 1.3M hello world executables.

I wonder if rewriting the linker from C to Go will be primarily rewriting, or maybe they will start fixing it somehow.


http://golang.org/doc/faq#Why_is_my_trivial_program_such_a_l...

The linkers in the gc tool chain (5l, 6l, and 8l) do static linking. All Go binaries therefore include the Go run-time, along with the run-time type information necessary to support dynamic type checks, reflection, and even panic-time stack traces.

A simple C "hello, world" program compiled and linked statically using gcc on Linux is around 750 kB, including an implementation of printf. An equivalent Go program using fmt.Printf is around 1.9 MB, but that includes more powerful run-time support and type information.

> A simple C "hello, world" program compiled and linked statically using gcc on Linux is around 750 kB

diet gcc -o hello hello.c; strip hello

2280 bytes on my system.

There are reasons why using glibc results in executables so big, and why it is tolerated (kind of). Those reasons hardly apply to a new language being actively developed. Yet said language produces executables almost twice the size.

"Run-time support and type information", why is it linked into an executable that never allocates memory and does no introspection of any kind?

> "Run-time support and type information", why is it linked into an executable that never allocates memory and does no introspection of any kind?

fmt.Print does use reflection.

Besides, bickering over the size of hello world is pretty pointless; better to compare the size of programs that actually do something.

We do recognise that Go binaries can and should be smaller, but probably not as small as you might hope.

https://github.com/golang/go/issues/6853

It is a valid point considering the lack of dynamic linking. Go (as is) strongly suggests having lots of "small programs" compose a larger (modular) system on a node. So those 1.9MBs do add up.
This makes me laugh, considering at Google we regularly deploy statically linked C++ programs that are two orders of magnitude larger.

"You call that a big binary? THIS..." etc

I was generating 7-15 MB binaries out of Delphi in the late '90s (it had a similar kitchen sink approach) and it simply wasn't an issue then and it certainly isn't an issue now.

I'm actually racking my brain for a case where a 500 kB vs 5 MB binary would be a deal breaker. Outside of embedded stuff I can't think of much.

From what I hear you have the funds and resources to operate at that scale.
> Besides, bickering over the size of hello world is pretty pointless; better to compare the size of programs that actually do something.

It is not about the size of hello world executable, that is just a symptom. A code smell if you like. There is something badly broken in the dead code (or dead data) elimination area. And I hope that code is in fact dead, because if it is not, add code generation to the list of smelly things.

What I suspect I see here is a kind of C++ vtable problem built deep into the language design somewhere. And the reaction is, let's talk about large executables so that it would kinda become not so visible. Or maybe let's take a look at glibc, because glibc is definitely a paragon of clear design befitting a new language.

> fmt.Print does use reflection

There are two problems with this. The lesser one is why does it need reflection to print a string. The bigger one is why do I see about 600 reflect.* entries in the resulting ELF instead of a single one for the string type.

> The bigger one is why do I see about 600 reflect.* entries in the resulting ELF instead of a single one for the string type.

There is definitely work that can be done to improve dead code elimination in the Go tool chain. The transition to Go will make this easier to achieve.

> What I suspect I see here is a kind of C++ vtable problem built deep into the language design somewhere.

Don't suspect. Dig into the problem and make some informed commentary. Idly speculating on HN is just spreading FUD, and benefits no-one.

You should read about Go's implementation of interfaces. It's not the same as C++'s vtable issue. http://research.swtch.com/interfaces

> The lesser one is why does it need reflection to print a string.

It doesn't print just strings. It can print anything. http://golang.org/src/fmt/print.go?s=6420:6467#L221
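
A rough sketch of what "can print anything" means in practice: the fmt functions take interface{} arguments, so the concrete type is only known at run time. The describe helper and the point type below are invented for the example and are not how fmt is implemented internally.

```go
package main

import (
	"fmt"
	"reflect"
)

// point is a hypothetical user-defined type that fmt has never seen.
type point struct{ X, Y int }

// describe returns the dynamic kind of v followed by its default formatting.
// The static type of v is interface{}, so both steps rely on run-time type
// information.
func describe(v interface{}) string {
	return reflect.TypeOf(v).Kind().String() + ": " + fmt.Sprint(v)
}

func main() {
	fmt.Println(describe("hello"))     // string: hello
	fmt.Println(describe(42))          // int: 42
	fmt.Println(describe(point{1, 2})) // struct: {1 2}
}
```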

After doing your research, you're welcome to submit a magical CL and PR that brings down the binary sizes, then you won't need to argue anymore.

> After doing your research, you're welcome to submit a magical CL and PR that brings down the binary sizes, then you won't need to argue anymore.

The original link is titled "The State of Go". The first thing I want to know about the state of a new language is whether it works. Then how well it works. Then, maybe, how to fix it and which VCS to use. There is an issue I think is within the range of the first two questions, but it is not even mentioned there.

So to make life a bit easier for people who, like me, expect that issue to be discussed first, I posted a comment summarizing (in my opinion) the state of Go.

My thoughts on how to fix it are hardly relevant to the current state of Go.

> It doesn't print just strings. It can print anything.

The fact it is just a string should be statically (build-time) inferable in a strongly-typed language. Reflection, at least as I understand it, implies run-time type information. So the question does make sense. Yes, I understand why it may be needed for a particular implementation of printf, this is why I called it a lesser issue.
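
To make the distinction concrete, here is a sketch of a hello world where the argument is a string at compile time and nothing is passed through interface{}. The greeting function is a made-up name. Whether this actually produces a smaller binary depends on the toolchain version, and no size is claimed here.

```go
package main

import "os"

// greeting returns the text to print. Its type is known statically.
func greeting() string {
	return "hello, world\n"
}

func main() {
	// (*os.File).WriteString takes a string directly, so the call needs
	// no run-time type inspection of its argument.
	if _, err := os.Stdout.WriteString(greeting()); err != nil {
		os.Exit(1)
	}
}
```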

> A simple C "hello, world" program compiled and linked statically using gcc

Statically. Care to check "ldd hello" of your binary?

"not a dynamic executable"

In case you wonder, that's dietlibc, which is typically built with no dynamic linking capabilities whatsoever.

Are you sure that fmt doesn't allocate memory or do introspection?
Yeah, what an outrage this is.

1.3 megabytes. That's like $0.00004 USD worth of hard drive space.

Does the Go team think we are all rich or something?

Plenty of CPUs out there with cache sizes smaller than that, or other things running on the machine that would also like to use the cache.
The size of the binary is completely irrelevant when considering if the code fits in the CPU cache. What's important is the size of code that actually executes. Go binaries have huge DWARF tables, and a lot of the code is dead code.

The code that actually executes is not bloated. It's not the most efficient code in the world, because the compilers don't have an optimizer as advanced as gcc's, but it's not unreasonably large.

I think the point he tried to make was that if "hello world" alone produces a 1.3 megabyte executable, the file size of a fairly complicated program made in Go will be significantly larger than the same program implemented in another language.
Which makes no sense, anyway. The reason a Hello World program in Go is large is that it must include the baseline runtime support that is included in any Go program. A 10 line program won't be 13 MB.
That's like trying to predict the cost of a flight by cost per mile using a quote from SFO to SJC as a baseline.
I'm looking at an executable of a medium program I'm working on, probably around 5k LOC of my own code, using tons of standard library modules and linking to probably another 10-20k LOC of 3rd party libraries, all debug symbols in place. Weighs 5MB. I don't think this program in C++ would weigh much less.

Anyway it can be improved, but to me there are far more important things to be improved about Go than the binary size of small programs.

A dependency parser (in other words, a serious program) in Go, linking in some external dependencies and a C++ machine learning library (statically):

  % du -k eval/eval
  2040	eval/eval

  % strip eval/eval
  % du -k eval/eval
  1864	eval/eval
Don't just assume. Measure.
You're right - it will be significantly larger. But then again, it doesn't depend on your system having all of the necessary libraries of the correct version and the overhead of dynamic linking.

It's a tradeoff, and worth it in my opinion.

That's not a reasoned point then.