by WalterBright 1926 days ago
> Can you actually be faster than C?

Sure, in any language that provides more semantic information than C does. For example, D enables a "pointer to immutable data" type, while C does not. This can improve optimization.

On a pragmatic note, C makes it easy to use 0-terminated strings, and clumsy to use length-terminated strings. The natural result is people use 0-terminated strings.

0-terminated strings are inefficient because of the constant need to strlen them. When I look to optimize C, I find plenty of paydirt in all the strlen instances. D makes it easy to use length-terminated strings, and so people naturally prefer them.
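
To make the strlen point concrete, here is a minimal sketch of a string that carries its length, written in C. The `str_slice` type and its helper functions are hypothetical names invented for this illustration; they are not from any standard library, and this is not necessarily how D lays out its strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical slice type: a pointer plus an explicit length. */
struct str_slice {
    const char *ptr;
    size_t len;
};

/* Wrap a 0-terminated string; this is the only place strlen runs. */
static struct str_slice slice_from_cstr(const char *s)
{
    struct str_slice out = { s, strlen(s) };
    return out;
}

/* Equality: the length comparison is O(1) and rejects most mismatches
   before any bytes are read. */
static int slice_equal(struct str_slice a, struct str_slice b)
{
    return a.len == b.len && memcmp(a.ptr, b.ptr, a.len) == 0;
}

/* Drop the first n bytes. Pure pointer arithmetic, no scan for a
   terminator. The caller must ensure n <= s.len. */
static struct str_slice slice_drop_front(struct str_slice s, size_t n)
{
    struct str_slice out = { s.ptr + n, s.len - n };
    return out;
}
```

Once the length travels with the pointer, later code never has to rescan the bytes to find the end, and sub-slices need no copy and no terminator.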


> For example, D enables a "pointer to immutable data" type, while C does not.

Wait, isn’t that what

  const int *x;
does? I.e. a pointer to a constant.
In C, the value pointed to by `x` cannot be changed via a write through `x`, but if there is another pointer to that value, it can be changed through that.

In D, `immutable(int)* x` cannot be changed by any reference to the value.
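
The C half of that can be sketched in a few lines. The function name `read_twice` is made up for this example; the point is only that `const int *` restricts writes through that one pointer, not writes to the object.

```c
/* Reads *x, writes through another pointer, then reads *x again.
   `const` on x does not let the compiler assume both reads yield
   the same value, because `other` may point at the same int. */
static int read_twice(const int *x, int *other)
{
    int before = *x;
    *other += 1;      /* legal even when other aliases x */
    int after = *x;   /* must be reloaded from memory */
    return after - before;
}
```

Calling `read_twice(&v, &v)` on an ordinary (non-const) `int v` is well defined and returns 1, which is why the compiler cannot cache `*x` here.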

Isn't that what 'restrict' is for? It's a new type of footgun of course because the compiler doesn't detect if the value is actually accessed through another pointer, but by simply ignoring that possibility via 'restrict', the compiler should have the same optimization opportunities, no?
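
For reference, a minimal sketch of what such a `restrict` promise looks like in C99. The function `add_twice` is a hypothetical example, and whether a given compiler actually exploits the promise is an assumption, not a guarantee.

```c
/* With restrict, the caller promises that dst and src do not overlap.
   The compiler may then load *src once and reuse it for both adds. */
static void add_twice(int *restrict dst, const int *restrict src)
{
    *dst += *src;
    *dst += *src;
}
```

The compiler does not check the promise: calling `add_twice(&a, &a)` compiles fine but is undefined behaviour.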
> but by simply ignoring that possibility via 'restrict',

If you deceive the compiler like that, it is entitled to hand you a broken executable :-/

Besides, I forgot to mention that D immutable data is inherently thread safe, no synchronization required. And you can have as many live references to it as you like.

Immutable data is part of D's support for functional programming.

Well yeah, sure. I wasn't picking on D, just pointing out that C offers an escape hatch, no matter how dangerous that might be ;)
restrict is so rarely used that LLVM has had open code generation bugs around it for years. Rust would like to use its non-aliasing guarantees to generate better code, but LLVM has been unable to reliably generate correct code from them.

So even if you manage to reason correctly around `restrict` in C, you can't count on the compiler to translate your code correctly.

GCC also had bugs around restrict, but I don't know about their current status.

I haven't found the compiler ever optimizing something differently because of constness. The various escape hatches mean that it is really hard for the optimizing compiler to make certain assumptions, especially if it's connected to something with external linkage. A function with a const struct arg that you never take the address of? The compiler emits a memcpy!
You might be right but if it's just a technicality and in practice nobody uses restrict the way you said, would anyone care about the theoretical superiority of C? I'm sure some would, but they wouldn't be the majority.
restrict is used so rarely in C code that the Rust team hit a lot of bugs when they tried to give LLVM all the aliasing info the Rust compiler has.
You’d think that C could do any appropriate optimizations as well as D, since the C compiler (in theory) should know whether there can exist (in any given program) any writeable pointers to the same value or if there only can exist pointers to const.
Only inside the same compilation unit.
Doing it across the whole compilation requires LTCG, but that does exist as well. It just makes incremental builds a pain with large binaries.
Pointer arithmetic would surely make all values mutable. Or even simple array access.
Yes, but the compiler should know if there is any pointer arithmetic present in the code, and whether any such pointer arithmetic could, possibly, result in a pointer pointing to the const data.
If you point to writable memory, that is. (Not all programs reside in RAM.)

    const int world = 42;
    const int const * const hello = &world ;
Apparently, you can be very const and `gcc -Wall -Wextra -std=c99` won't raise any complaints.
Firstly, isn’t that a syntax error? There’s a stray “const” in there. You probably meant

  const int * const hello = &world;
Secondly, what should the compiler complain about? You have a const int, and then a const pointer to const int, pointing to that first const int. What’s the problem?

Thirdly, the latest C version supported by GCC is “-std=c17”.
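
For anyone unsure what each `const` position means, here is a small sketch. The function `const_positions` and its variables are invented for this illustration.

```c
static int const_positions(void)
{
    int a = 1, b = 2;

    const int *p = &a;        /* cannot write *p; p itself may be reseated */
    int *const q = &a;        /* q cannot be reseated; *q is writable */
    const int *const r = &b;  /* neither the pointer nor the pointee */

    p = &b;    /* fine: p is not const, only what it points to */
    *q = 10;   /* fine: writes a through q */

    return *p + *q + *r;  /* 2 + 10 + 2 */
}
```

Writing `*p = 3;`, `q = &b;` or `*r = 3;` instead would be rejected by the compiler.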

Nah const is fun

  int main()
  {
    const int world = 42;
    const const const int const const * const const hello = &world;

    return 0;
  }
is a valid program:

https://onlinegdb.com/SJcwJrRzd

Sure, but those extra “const” do not mean anything, and are apparently ignored by the compiler.
> In D, `immutable(int)* x` cannot be changed by any reference to the value.

even if I use a magnetic needle to change the bits in my RAM ? :)

There's an easy answer to that: using a magnetic needle to change the bits in your RAM violated the assumptions of the compiler, leading to undefined behaviour. What exactly will happen in that circumstance is just that: undefined.
To put that another way: hardware failures are beyond the scope of the compiler.
Maybe your compiler.
I think the point of the parent is that in principle you can always implement those optimizations by hand in C although it might of course be impractical.
If you go down this road then you can always drop down to assembler to be even faster than C.

I don't think this is a reasonable argument. Every Turing-complete language that gives you direct access to the metal, so to speak, provides you with the opportunity to implement these optimizations by hand.

I think it's much better to look at average C code there. And then C has a tremendous advantage with its compiler support. C compilers have decades of optimization put into them. It will take a while for other languages to catch up to that.

The three D compilers use the GCC backend, the LLVM backend, and the Digital Mars backend, respectively. All the decades of optimization in them are taken advantage of by D. If you write C-style code in D, you'll get the same code generated as if you wrote it in C.
But isn't that a consequence of D "mimicking" C to some extent?

Disclaimer: my compiler background was the Java compiler and its bytecode generation, but I would expect both gcc and clang/llvm to have lots of optimization cruft hardcoded for how C programs are written. If you deviate from that, you potentially get less well-optimized code.

> But isn't that a consequence of D "mimicking" C to some extent?

It's true that multi-language backends will inevitably cater to the semantics of the dominant language.

For a related example, I discovered that gdb doesn't pay much attention to the DWARF symbolic debug spec. It pays attention to the particular way gcc generates symbolic debug info. Other patterns, while compatible with the spec, don't work with gdb. Similar problems have happened with linkers.

It's a matter of "do as I do, not as I say".

Since I am the author of DMD's backend, I've been putting in optimizations that do cater to D, rather than just C and C++. For example, in the last few days I started putting in optimizations to take advantage of D's new "bottom" type:

https://github.com/dlang/dmd/pull/12245

> in principle you can always implement those optimizations by hand in C

Some things, like integral promotion rules, you can't get around. You may know that the promotion can be skipped in certain cases, but the compiler cannot know it.

Leading to eye-rolling problems like these: https://github.com/biojppm/rapidyaml/issues/40
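
A minimal sketch of the promotion rule itself, assuming only standard C and `<stdint.h>`. The function names are made up for this illustration, and it is not a reproduction of the linked issue.

```c
#include <stdint.h>

/* Operands narrower than int are promoted to int before arithmetic,
   so the shift below is carried out in int, not in uint8_t. */
static int shift_promoted(uint8_t x)
{
    return x << 1;             /* 0x80 << 1 is 256, not 0 */
}

/* The 8-bit result has to be requested with an explicit narrowing. */
static uint8_t shift_truncated(uint8_t x)
{
    return (uint8_t)(x << 1);  /* 256 wraps to 0 */
}

/* ~ also operates on the promoted int, so the result is negative
   for every uint8_t input. */
static int complement_is_negative(uint8_t x)
{
    return ~x < 0;
}
```

The language requires the wide computation even when the programmer knows only the low 8 bits matter; the cast narrows the result afterwards.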