
by WalterBright 1973 days ago

> Can you actually be faster than C?

Sure, in any language that provides more semantic information than C does. For example, D enables a "pointer to immutable data" type, while C does not. This can improve optimization.

On a pragmatic note, C makes it easy to use 0-terminated strings, and clumsy to use length-terminated strings. The natural result is people use 0-terminated strings.

0-terminated strings are inefficient because of the constant need to strlen them. When I look to optimize C, I find plenty of paydirt in all the strlen instances. D makes it easy to use length-terminated strings, and so people naturally prefer them.
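To make the strlen cost concrete, here is a minimal C sketch. The `str_slice` type and both function names are hypothetical, not from any library; they only contrast rescanning a 0-terminated string with carrying the length alongside the pointer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer-plus-length string, similar in spirit to a D slice. */
typedef struct {
    const char *ptr;
    size_t len;
} str_slice;

/* 0-terminated: the length has to be rediscovered by scanning for the
   terminator. Unless the compiler can prove the string is unchanged,
   strlen may run again on every loop iteration. */
size_t count_char_cstr(const char *s, char c)
{
    size_t n = 0;
    for (size_t i = 0; i < strlen(s); i++)
        if (s[i] == c)
            n++;
    return n;
}

/* Length-carrying: the length is already known, so no scan is needed. */
size_t count_char_slice(str_slice s, char c)
{
    size_t n = 0;
    for (size_t i = 0; i < s.len; i++)
        if (s.ptr[i] == c)
            n++;
    return n;
}
```

Both functions return the same count; the difference is only in how often the string has to be walked to find out where it ends.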


teddyh 1973 days ago

> For example, D enables a "pointer to immutable data" type, while C does not.

Wait, isn’t that what

  const int *x;

does? I.e. a pointer to a constant.

WalterBright 1973 days ago

In C, the value pointed to by `x` cannot be changed via a write through `x`, but if there is another pointer to that value, it can be changed through that.
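A small C sketch of that aliasing case (the function name is made up for illustration). Because `y` may point at the same `int` as `x`, the compiler has to reload `*x` after the store instead of reusing the first read:

```c
/* The const here promises only that this function will not write
   through x. It says nothing about writes through other pointers. */
int sum_before_and_after(const int *x, int *y)
{
    int before = *x;     /* first read through the const pointer */
    *y = before + 1;     /* legal write; also changes *x if y aliases x */
    return before + *x;  /* so *x must be read again, not reused */
}
```

Calling it with the same address for both arguments, on a non-const `int`, is well-defined C, and the second read sees the new value.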

In D, `immutable(int)* x` cannot be changed by any reference to the value.

flohofwoe 1973 days ago

Isn't that what 'restrict' is for? It's a new type of footgun of course because the compiler doesn't detect if the value is actually accessed through another pointer, but by simply ignoring that possibility via 'restrict', the compiler should have the same optimization opportunities, no?
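A minimal sketch of what that looks like in C99 (the function name `add_twice` is hypothetical). The `restrict` qualifiers are a promise made by the caller, not something the compiler verifies:

```c
/* restrict tells the compiler that dst and src do not overlap, so it
   may load *src once and reuse the value for both additions. */
void add_twice(int *restrict dst, const int *restrict src)
{
    *dst += *src;
    *dst += *src;
}
```

With distinct objects this behaves as written. Passing the same address for both parameters breaks the promise and is undefined behaviour, which is the footgun in question.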

WalterBright 1973 days ago

> but by simply ignoring that possibility via 'restrict',

If you deceive the compiler like that, it is entitled to hand you a broken executable :-/

Besides, I forgot to mention that D immutable data is inherently thread safe, no synchronization required. And you can have as many live references to it as you like.

Immutable data is part of D's support for functional programming.

flohofwoe 1973 days ago

Well yeah, sure. I wasn't picking on D, just pointing out that C offers an escape hatch, no matter how dangerous that might be ;)

Omin 1973 days ago

restrict is so rarely used that clang has had open code generation bugs for years. Rust would like to use its non-aliasing guarantees to generate better code, but clang is unable to reliably generate correct code.

So even if you manage to reason correctly around `restrict` in C, you can't count on the compiler to translate your code correctly.

GCC also had bugs around restrict, but I don't know about their current status.

yvdriess 1973 days ago

I haven't found the compiler ever optimizing something differently because of constness. The various escape hatches mean that it is really hard for the optimizing compiler to make certain assumptions, especially if the code is connected to something with external linkage. A function with a const struct arg that you never take the address of? The compiler emits a memcpy!

imtringued 1973 days ago

You might be right, but if it's just a technicality and in practice nobody uses restrict the way you said, would anyone care about the theoretical superiority of C? I'm sure some would, but they wouldn't be the majority.

adrianN 1973 days ago

restrict is used so rarely in C code that the Rust team hit a lot of bugs when they tried to give LLVM all the aliasing info the Rust compiler has.

teddyh 1973 days ago

You’d think that C could do any appropriate optimizations as well as D, since the C compiler (in theory) should know whether there can exist (in any given program) any writeable pointers to the same value or if there only can exist pointers to const.

marton78 1973 days ago

Only inside the same compilation unit.

ygra 1973 days ago

Doing it across the whole compilation requires LTCG, but that does exist as well. It just makes incremental builds a pain with large binaries.

innocenat 1973 days ago

Pointer arithmetic would surely make all values mutable. Or even simple array access.

teddyh 1972 days ago

Yes, but the compiler should know if there is any pointer arithmetic present in the code, and whether any such pointer arithmetic could, possibly, result in a pointer pointing to the const data.

makapuf 1973 days ago

If you point to writable memory. (Not all programs reside in RAM.)

knome 1973 days ago

    const int world = 42;
    const int const * const hello = &world;

Apparently, you can be very const and `gcc -Wall -Wextra -std=c99` won't raise any complaints.

teddyh 1973 days ago

Firstly, isn’t that a syntax error? There’s a stray “const” in there. You probably meant

  const int * const hello = &world;

Secondly, what should the compiler complain about? You have a const int, and then a const pointer to const int, pointing to that first const int. What’s the problem?

Thirdly, the latest C version supported by GCC is “-std=c17”.

tick_tock_tick 1973 days ago

Nah const is fun

  int main()
  {
    const int world = 42;
    const const const int const const * const const hello = &world;

    return 0;
  }

is a valid program.

https://onlinegdb.com/SJcwJrRzd

teddyh 1973 days ago

Sure, but those extra “const” do not mean anything, and are apparently ignored by the compiler.

jcelerier 1973 days ago

> In D, `immutable(int)* x` cannot be changed by any reference to the value.

Even if I use a magnetic needle to change the bits in my RAM? :)

petertodd 1973 days ago

There's an easy answer to that: using a magnetic needle to change the bits in your RAM violates the assumptions of the compiler, leading to undefined behaviour. What exactly will happen in that circumstance is just that: undefined.

MaxBarraclough 1973 days ago

To put that another way: hardware failures are beyond the scope of the compiler.

wizzwizz4 1973 days ago

Maybe your compiler.

gpderetta 1973 days ago

I think the point of the parent is that in principle you can always implement those optimizations by hand in C, although it might of course be impractical.

bhaak 1973 days ago

If you go down this road then you can always drop down to assembler to be even faster than C.

I don't think this is a reasonable argument. Every Turing-complete language that gives you direct access to the metal, so to speak, provides you with the opportunity to implement these optimizations by hand.

I think it's much better to look at average C code there. And then C has a tremendous advantage with its compiler support. C compilers have decades of optimization put into them. It will take a while for other languages to catch up to that.

WalterBright 1973 days ago

The three D compilers use the GCC backend, the LLVM backend, and the Digital Mars backend, respectively. All the decades of optimization in them are taken advantage of by D. If you write C-style code in D, you'll get the same code generated as if you wrote it in C.

bhaak 1973 days ago

But isn't that a consequence of D "mimicking" C to some extent?

Disclaimer: my compiler background is the Java compiler and its bytecode generation, but I would expect both gcc and clang/llvm to have lots of optimization cruft hardcoded for how C programs are written. If you deviate from that, you potentially get less well optimized code.

WalterBright 1972 days ago

> But isn't that a consequence of D "mimicking" C to some extent?

It's true that multi-language backends will inevitably cater to the semantics of the dominant language.

For a related example, I discovered that gdb doesn't pay much attention to the DWARF symbolic debug spec. It pays attention to the particular way gcc generates symbolic debug info. Other patterns, while compatible with the spec, don't work with gdb. Similar problems have happened with linkers.

It's a matter of "do as I do, not as I say".

Since I am the author of DMD's backend, I've been putting in optimizations that do cater to D, rather than just C and C++. For example, in the last few days I started putting in optimizations to take advantage of D's new "bottom" type:

https://github.com/dlang/dmd/pull/12245

WalterBright 1972 days ago

> in principle you can always implement those optimizations by hand in C

Some things, like integral promotion rules, you can't get around. You may know that the promotion can be skipped in certain cases, but the compiler cannot know it.
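As an illustration of the rule itself (not of any particular missed optimization), here is a small C sketch with hypothetical function names. Operands narrower than `int` are promoted to `int` before the arithmetic happens, whether or not the programmer wanted 8-bit arithmetic:

```c
#include <stdint.h>

/* Both operands are promoted to int before the addition, so the
   intermediate sum can exceed 255 without wrapping. */
int sum_promoted(uint8_t a, uint8_t b)
{
    return a + b;
}

/* Only the conversion back to uint8_t truncates the result. */
uint8_t sum_truncated(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}
```

The addition in both functions is specified in terms of `int`; the narrowing is a separate step that the source has to spell out.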

lenkite 1973 days ago

Leading to eye-rolling problems like these: https://github.com/biojppm/rapidyaml/issues/40