Without knowing what your program did, it is hard to see what language features caused the difference. In any case, why was Java not a better option than Golang?
Go outputs a statically linked binary that just RUNS. Java needs a fairly heavyweight runtime installed, and that runtime imposes quite a bit of startup overhead. That's just one reason: for short-running CLI-type utilities, Java is not in the same ballpark.
OK, so let's go with your use case. You can run basic Java ("hello world") in under 3 MB with a startup time of about 0.1 seconds. That is the true overhead if you really care about tight code. The default values are pretty large. Everything beyond that (memory/startup) comes from the external libraries you need and the additional memory the program uses as it grows.
Given that "kkowalczyk" talked about 10K-line programs, what applications are you thinking of that are 10K lines and cannot tolerate a 0.1 second/3 MB overhead? Would you restart Java every time? Note that even a simple hello-world C program has about 0.01 second/0.5 MB overhead.
I was going to say there's no way Java programs boot in 0.1 seconds, but I looked it up to be sure. Here are the results on my Mac/i7, defaulted to server mode:
$ time java HelloWorld
Hello, world!
real 0m0.101s
Touché. 0.1 seconds is exactly right, at least on my setup. That said, javac is slow, given that the program is 5 lines of code:
$ time javac HelloWorld.java
real 0m0.511s
user 0m0.833s
sys 0m0.050s
And I'd venture to guess that there must be something to the JIT being pretty slow for real-world applications; otherwise people wouldn't complain about it so frequently. Maybe the cost of JIT optimizations grows linearly-ish with the amount/complexity of code?
FWIW, we went with Go at my work instead of Java because our application is memory-intensive, and there are huge gains there in Go over Java.
A program that consists solely of println("Hello, World!"); is pretty stateless and trivial; there's not much that a JIT could do with it.
Somewhat anecdotally, I remember hearing Rich Hickey talking about how the JVM JIT loves stateless code. I'm pretty sure the vast majority of Java code in use is deeply stateful. I don't know how much of a difference this actually makes but it seemed like a relevant data point.
You are off by two orders of magnitude in the C case, at least with my trivial test case of writing hello world, and then a runner program that forks/execs/waitpids 10,000 times.
If you know what crt0.c does, you can see C is also pretty much the asymptote of what you're going to get from program startup, so it's a little silly to make the comparison.