Hacker News
by olau 1730 days ago
> The problem here isn’t that the language has GC, it’s that memory usage was just not considered.

While I agree with the gist of what you're saying, I do think runtimes based on the we'll-clean-it-up-some-day GC paradigm make it more important to consider memory allocation than less laissez-faire paradigms (like RAII or reference counting) do, contrary to how it's presented in the glamorous brochures.

2 comments

Put it this way: each of the things mentioned in that post was an error that could just as easily have been made in Rust, and that Rust would not necessarily have helped avoid. At best you can make a case for the errors being more explicit, but in my personal experience even that would be weak.

The last error in particular, using byte buffers instead of a streaming abstraction, is pervasive in programming. I don't know if Rust's library environment is necessarily any worse than Go's for dealing with that problem, but I doubt it's any better. By having io.Reader in the standard library from the beginning (and not because of any other particular virtue of the language, IMHO), Go has had one of the best ecosystems for dealing with streams without having to manifest them as full byte buffers [1].

It amounts to this: the root problem is that they didn't have the problem they thought they had. Rust will blow the socks off the competition w.r.t. memory efficiency of lots of small objects, which is why it's so solid in the browser space. But that's not the problem they were having. Go's just fine where they seem to have ultimately ended up, stream processing things with transient per-object processing. Even if you do some allocation in the processing, the GC ends up not being a big deal because the runs end up scanning over not much memory, not all that frequently. This is why Go is so popular in network servers. Could Rust do better? Yes. Absolutely, beyond a shadow of a doubt. But not enough to matter, in a lot of cases.

[1]: An expansion on that thought if you like: https://news.ycombinator.com/item?id=28368080

I think the Rust and Go stories with buffers vs. readers are pretty comparable. They both have good support for readers, and too-good support for reading whole messages into slices or Vec<u8>s.

Good to hear. I hope it's something all new languages have going forward, because like I mentioned in my extended post it's almost all about setting the tone correctly early in the standard library & culture, rather than any sort of "language feature" Go had.

As mostly-a-network engineer it's a major pet peeve of mine when I have to step back into some environment where everything works with strings. I can just feel the memory screaming.

You mean just like how XML-RPC and JSON-RPC (sorry, REST) work?

Because the best way to contribute to global warming is to waste CPU cycles serializing and deserializing data structures into XML and JSON, and parsing them as well.

More importantly, GC'ed languages tend to use at least 2x the memory of un-GC'ed languages and have to deal with the consequences of GC-induced pauses and generally inferior native code interop. Whether that matters to you or not depends on your application. No one is going to use a GC'ed language in the Linux kernel, but practically 100% of backend applications are written in GC'ed languages because the productivity benefits of automatic memory management are massive.

I’m not really sure if that 2x figure is accurate. I’ve seen charts on both sides of this and a lot here depends on your programming language and the things it can optimize: with Linear/Affine types, I’m fairly sure Haskell could, in theory, eliminate GC deterministically from the critical sections of your code-base without forcing you to adopt manual memory management universally.

But there’s just the fact that people writing real-time/near real-time systems do, in fact, choose GC languages and make it work: video games are one example, with Minecraft and Unity being the major examples. But also HFT systems: Jane Street heavily uses OCaml and other companies use Java/etc. with specialized GCs.

This is not even to mention the microbenchmarks that seem to indicate that Common Lisp and Java can match or exceed Rust for tasks like implementing lock-free hash maps and various other things https://programming-language-benchmarks.vercel.app/problem/s...

I am aware that you can hit really good latency targets with GC'ed languages, like in the video game and finance industry. Whenever I investigate examples, though, I find the devs have to go through a ton of effort to avoid memory allocations, and then I ask if using the GC'ed language was even worth it in the first place?

I'm actually fascinated with the idea of going off-heap in the hotspots of GC'ed languages to get better performance. Netty, for instance, relies on off-heap allocations to achieve better networking performance. But, once you do so, you start incurring the disadvantages of languages like C/C++, and it can get complicated mixing the two styles of code.

"Whenever I investigate examples, though, I find the devs have to go through a ton of effort to avoid memory allocations"

Yep, also the median dev in a GC'ed language is simply incapable of writing super efficient code in these languages because they rarely have to. You would have to bring in the best of the best people from those communities or put your existing devs through a pretty significant education process that is similar in difficulty to just learning/using Rust.

The resulting code will be very different to what typical code looks like in those languages, so the supposed homogeneity benefits of just writing fast C#/Java when it's needed are probably not quite true. You'd basically have to keep that project staffed up with these kinds of people and ensure they have very good Prod observability to ensure regressions don't appear.

Yes, and I think one important aspect to this is the necessary CI/CD changes needed to support these kinds of optimizations. If your performance targets are tight enough that you are making significant non-standard optimizations in your GC'ed language, you're probably going to want some automated performance regression testing in your deployment pipeline to ensure you don't ship something that falls down under load. In my experience, building and maintaining those pipeline components is not easy.

> … tasks like implementing lock-free hash maps…

Please be specific.

You pointed to spectral-norm, what does that have to do with lock-free hash maps?

The 2.java program seems to be 4x slower than the 7.rs program!

Look at 2.cl, though: the Lisp solution is faster than everything except one C++ solution. (And, aside from the SIMD intrinsics, the Lisp solution is fairly idiomatic.)

I was referring to this with the lock-free hash maps: https://twitter.com/nodefunallowed/status/137196906733924761...

> I was referring to this with the lock-free hash maps…

Well thank you for providing an actual reference.

afaict from a twitter thread, "42nd At Threadmill" and "Luckless" are both Lisp re-implementations of the same Java hashtable code.

afaict the Rust software is not a re-implementation of that same Java hashtable code.

afaict that chart does not show any measurements of Java software, just Lisp and Rust.

So "… Java can match or exceed Rust …" seems to be based on nothing.

> Look at 2.cl, though…

So hand-coded AVX is hand-coded AVX in any language?

I mostly agree with what you're saying, but I'll also add that GC pauses are mostly a problem of yesteryear unless you're either managing truly enormous amounts of memory or have hard real-time requirements (and even then it's debatable). Modern GCs, as seen in Go, Java 11+, and .NET 4.5+, guarantee sub-millisecond pauses on terabyte-scale heaps (I believe the JS GC does as well, but I'm less sure).