by sprw121 4069 days ago
D is an awesome language to work with; it's got many useful language features that make writing code a pleasure. I hope this criticism is taken constructively.

However, not to beat a dead horse, but if you want to process more than a trickle of data in it, you run into problems with the GC really quickly. I really feel the language would be better off without the GC.

These are not the same issues addressed by the JSON compiler post or whatever that surfaced a couple of months ago.

From what I can tell, there's a global lock around everything in the GC, including allocations. In a multi-core world, this simply doesn't work, and it is one of the major pain points of the language. I write data-intensive processing code for high-core-count machines (32 cores), and have had to resort to zero-allocation strategies or, in map-reduce contexts, to sharding at the process level: writing the results to disk and then running a reducer process over them.

You can write performant D code, but you give up large amounts of code safety. It's essentially just whatever you'd write in C++, without the ownership semantics that C++ gives you.

You can't have a core part of your language being an essentially unavoidable massive point of contention.

I've literally seen 20x or more speedups in multithreaded cases just by making sure I reuse every buffer rather than create new ones.
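
As a rough illustration of the buffer-reuse pattern (a minimal sketch, not the code from the workload described above; `checksum`, `processFresh`, and `processReused` are hypothetical names invented for this example):

```d
// Hypothetical per-record work: fill a scratch buffer, then checksum it.
ulong checksum(ubyte[] scratch, size_t seed)
{
    ulong sum = 0;
    foreach (i, ref b; scratch)
    {
        b = cast(ubyte)((i + seed) & 0xFF);
        sum += b;
    }
    return sum;
}

// Allocates a fresh GC array for every record.
ulong processFresh(size_t records, size_t len)
{
    ulong total = 0;
    foreach (r; 0 .. records)
    {
        auto scratch = new ubyte[](len); // one GC allocation per record
        total += checksum(scratch, r);
    }
    return total;
}

// Allocates once and reuses the same buffer for every record.
ulong processReused(size_t records, size_t len)
{
    ulong total = 0;
    auto scratch = new ubyte[](len);     // single allocation up front
    foreach (r; 0 .. records)
        total += checksum(scratch, r);   // buffer is overwritten in place
    return total;
}

void main()
{
    // Both variants compute the same result; only the allocation count differs.
    assert(processFresh(1_000, 4_096) == processReused(1_000, 4_096));
}
```

In a multithreaded version, each worker would keep its own scratch buffer, so the hot loop never enters the allocator at all. If the allocator does take a global lock, as suggested above, that is where the contention would disappear.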

I feel this is really holding back an otherwise great language to work in.

This is discussed in the reddit thread in more detail.

4 comments

> I've literally seen 20x or more speedups in multithreaded cases just by making sure I reuse every buffer rather than create new ones.

Reuse rather than free and reallocate is a core practice whenever you feel the need for speed, regardless of the memory allocation strategy used.

For some very fast D code:

https://github.com/facebook/warp

Minimizing the amount of heap memory allocated is a core strategy.

Thanks for responding. I've seen this code used as an example before, actually.

It's not that buffer reuse or avoiding heap allocations was unknown to me; it was just surprising to see this in an application that spent most of its time waiting on the network.

I will say as a positive point that the D code equivalent to a C++ implementation was much cleaner, due to built-in array slicing among other things.

There are other areas where heap allocations are not so avoidable, however (some hashmaps, some std modules). My main point is that, as in all languages, heap allocations are slow, but here they carry an unnecessarily large contention cost.

I really like D so I hope this is helpful feedback.

"Reuse rather than free and reallocate is a core practice whenever you feel the need for speed"

The problem with this conventional wisdom is that it defeats other optimisations. If you allocate, use, and deallocate a block of memory within a compilation unit, a compiler can transparently allocate it on the stack, or even keep it in registers.

If you use a free-list, the objects always escape, and the same optimisations can't be applied to them.

This is particularly true for small temporary objects, like an intermediate vector (as in coordinates) object.

I got a speedup of around 13.32x on a 16-core Opteron machine a few years ago, using a simple parallel foreach on an N-body problem simulator. I didn't do anything special for it.

I thought about using SIMD to accelerate the integrator and the acceleration calculation further, inside the parallel foreach.
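
For readers unfamiliar with the feature, a parallel foreach over an N-body acceleration step might look roughly like the sketch below. This is not the simulator mentioned above: the `Body` and `Vec3` types, the `accelerations` function, the softening term, and the choice of G = 1 are all assumptions made for illustration. It uses `taskPool.parallel` from `std.parallelism`.

```d
import std.math : sqrt;
import std.parallelism : taskPool;

// Hypothetical types; not taken from the simulator mentioned above.
struct Vec3 { double x, y, z; }
struct Body { Vec3 pos; double mass; }

// Fills acc[i] with the gravitational acceleration on body i (G = 1).
// Both arrays are allocated by the caller, so the loop itself never
// asks the GC for memory.
void accelerations(const(Body)[] bodies, Vec3[] acc)
{
    enum softening = 1e-9; // assumed term; avoids division by zero

    // Each iteration writes only acc[i], so iterations are independent
    // and can be distributed across the task pool's worker threads.
    foreach (i, ref a; taskPool.parallel(acc))
    {
        a = Vec3(0.0, 0.0, 0.0);
        foreach (j; 0 .. bodies.length)
        {
            if (i == j) continue;
            immutable dx = bodies[j].pos.x - bodies[i].pos.x;
            immutable dy = bodies[j].pos.y - bodies[i].pos.y;
            immutable dz = bodies[j].pos.z - bodies[i].pos.z;
            immutable d2 = dx * dx + dy * dy + dz * dz + softening;
            immutable scale = bodies[j].mass / (d2 * sqrt(d2));
            a.x += dx * scale;
            a.y += dy * scale;
            a.z += dz * scale;
        }
    }
}

void main()
{
    auto bodies = [Body(Vec3(0.0, 0.0, 0.0), 1.0),
                   Body(Vec3(1.0, 0.0, 0.0), 1.0)];
    auto acc = new Vec3[](bodies.length); // allocated once, outside the loop
    accelerations(bodies, acc);

    // Two equal masses pull on each other along x with opposite signs.
    assert(acc[0].x > 0 && acc[1].x < 0);
    assert(acc[0].y == 0 && acc[0].z == 0);
}
```

A loop of this shape performs no allocation per step, which is consistent with it scaling well regardless of how the GC behaves under contention.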

Unless you are creating or destroying bodies, your N-body implementation won't need to allocate any memory, will it? So is it relevant to what he's talking about?

As an aside:

I just discovered that the language authors are converting the standard library (Phobos) to be "no GC".

It's a little more nuanced than that. Phobos is moving towards a more pipeline style, of which http://dlang.org/phobos/std_algorithm.html is a standout example.

Algorithms, by and large, do not need to dynamically allocate memory. As more of Phobos is converted into algorithms, it becomes agnostic to whatever allocation method is used. Allocation strategy becomes a high-level decision rather than a low-level one.
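
A small sketch of what that looks like in practice, using `iota`, `filter`, `map`, `sum`, and `copy` from Phobos (the function name `sumSquaresOfEvens` and the numbers are made up for this example). The pipeline itself is lazy and allocates nothing; where the results end up is decided by the caller.

```d
import std.algorithm : copy, filter, map, sum;
import std.array : array;
import std.range : iota;

// Marked @nogc, so the compiler rejects any GC allocation in this function.
int sumSquaresOfEvens(int lo, int hi) @nogc
{
    return iota(lo, hi)
        .filter!(n => n % 2 == 0)
        .map!(n => n * n)
        .sum;
}

void main()
{
    // Consuming the pipeline directly needs no heap memory.
    assert(sumSquaresOfEvens(1, 11) == 220);

    // The same lazy pipeline, built once and reused below.
    auto pipeline = iota(1, 11)
        .filter!(n => n % 2 == 0)
        .map!(n => n * n);

    // Caller's choice 1: write the results into a stack buffer.
    int[5] onStack;
    pipeline.copy(onStack[]);
    assert(onStack[] == [4, 16, 36, 64, 100]);

    // Caller's choice 2: collect into a GC-allocated array.
    int[] onHeap = pipeline.array;
    assert(onHeap == [4, 16, 36, 64, 100]);
}
```

Only the last step touches the GC, and only because the caller asked for it with `.array`.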

While true, it is more a consequence of the GC's quality than of having a GC at all.

D with a GC of HotSpot-comparable quality would be quite good.