by Xurinos 4941 days ago

I guess I will be your essay-writing dissenter.

C is a high-level language with fewer features than many other languages. It is not necessarily the engine behind other languages, its programs tend to poorly implement a percentage of what other high-level languages do quickly and securely, and it provides slow and troublesome memory allocation out of the box. When comparing the speed of operations, people are rarely comparing apples to apples. And contrary to what is spattered on the boards, C is not an understandable, close-to-the-metal wrapper around assembly instructions (compilers have advanced quite a bit).

I love C. It does feel fast, and I get the illusion of being close to the metal. It was one of my first languages, holding a sentimental place in my heart. Very important things are written in it. It is a high-level language with some okay abstractions.

Is it underneath other high-level languages? Maybe, if you mean that the compiler might be written in C in order to bootstrap the language. Of course, one could write the compiler in any language; it's all about translating programmer-friendly symbols into assembly or VM bytecode, right? And compile speed is a different subject from the speed of the compiled program.

But here is the gotcha on raw performance: Your large C program poorly implements a percentage of what other high level languages are capable of doing quickly and securely.

I once foolishly argued in favor of C's performance, saying that one could write a layer that speedily supports all these nice features, such as GC and the data structures I mention below. But by the time you do that, you might as well be using a different language. You probably implemented that layer poorly, compared to other languages with large communities pounding at and optimizing that layer. For example, when you implemented your "fast" list with the basic struct and next pointer, did your new-node creation still call raw malloc() every time, as opposed to efficiently managing previously malloc'ed memory?
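To make "managing previously-malloced memory" concrete, here is a minimal sketch of a node pool with a free list. It is written in Python purely to show the bookkeeping; it makes no speed claim, and `Node`, `NodePool`, `acquire`, and `release` are hypothetical names, not anything from a real library. The `_grow` step stands in for one large malloc().

```python
class Node:
    """Singly linked list node: a value plus a next pointer."""
    __slots__ = ("value", "next")

    def __init__(self):
        self.value = None
        self.next = None


class NodePool:
    """Hypothetical pool: creates nodes in chunks and recycles released ones."""

    def __init__(self, chunk_size=64):
        self.chunk_size = chunk_size
        self.free = None            # head of the free list, linked through .next
        self.chunks_allocated = 0   # how many times we went back for more memory

    def _grow(self):
        # Stand-in for one big malloc(): create a whole chunk at once
        # and thread every new node onto the free list.
        for _ in range(self.chunk_size):
            node = Node()
            node.next = self.free
            self.free = node
        self.chunks_allocated += 1

    def acquire(self, value, next_node=None):
        # Pop a node off the free list, growing the pool only when it is empty.
        if self.free is None:
            self._grow()
        node = self.free
        self.free = node.next
        node.value = value
        node.next = next_node
        return node

    def release(self, node):
        # Push the node back onto the free list instead of discarding it.
        node.value = None
        node.next = self.free
        self.free = node


pool = NodePool(chunk_size=4)
head = None
for v in (1, 2, 3):
    head = pool.acquire(v, head)      # list is now 3 -> 2 -> 1

old_head = head
head = head.next                      # drop the 3 from the list
pool.release(old_head)
recycled = pool.acquire(99, head)     # comes straight off the free list
```

The point of the sketch is that creating or deleting a list node becomes a couple of pointer updates, and the expensive request for fresh memory happens once per chunk rather than once per node.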

How many implementations of a basic list do we need in C? Super large integers? Fixed-point integers? Growable arrays? Lazy/infinite lists? Trees? Hash maps? Surely you don't think these other language designers said to themselves, "Let's support hash maps and make them slow." No, they came up with a fast standard, supported by their language, sometimes complete with various configuration options to make all the tradeoff decisions on making those data structures speed-efficient or memory-efficient for reads or writes. Others, of course, subscribe to a religion, er, a specific tradeoff, such as Perl's approach to {}s ("There's more than one way to do it" ... unless you are dealing with hash tables).

What about all the wonderful memory management you can do in C? Aren't you closer to the metal that way, able to make basic memory allocation super speedy? Not really. This is part of the illusion. malloc() is slow enough that developers have rewritten versions of it several times. ROM-based MUDs, for example, manage their own memory, using an initial malloc, of course, but regularly using their own set of allocators and deallocators (free_string, str_dup, etc) on top of that allocation. There are these tricks and more in high level languages, including the sharing of partial structures (kinda like union but with more pointers and fewer bugs associated with those pointers), allowing for resource allocation strategies that can be "faster than C".
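As a small illustration of "sharing of partial structures", here is a sketch of two immutable lists that share a tail instead of copying it. The `cons` and `to_pylist` helpers are hypothetical names used only for this example; the lists are plain nested tuples.

```python
# An immutable list as nested (head, tail) tuples; None is the empty list.
def cons(head, tail):
    return (head, tail)


def to_pylist(cell):
    # Walk the cells and collect the heads into an ordinary Python list.
    out = []
    while cell is not None:
        head, cell = cell
        out.append(head)
    return out


shared = cons(2, cons(3, None))   # the common tail: 2 -> 3
a = cons(1, shared)               # 1 -> 2 -> 3
b = cons(0, shared)               # 0 -> 2 -> 3, reusing the same tail cells
```

Because the cells are never mutated, `a` and `b` can safely point at the same tail, so building `b` costs one new cell rather than a copy of the whole list.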

If the argument in favor of C's speed at the end of the day is, "When we write crappy programs with buffer overrun holes, memory leaks, and no error handling, it's super fast!", we are (1) not comparing apples to apples and (2) doing ourselves and our customers a grave disservice.

Let's be honest: C is no "closer to the metal" than other high level languages (http://news.ycombinator.com/item?id=3753530 and https://en.wikipedia.org/wiki/Low-level_programming_language...). The days of manually XORing to assign 0 to a variable are well behind us.

Note... I don't mean all high-level languages. There are many very slow implementations of these languages. Programmers are getting better at this stuff in modern implementations, gcc included. And it is fair to say that there are many things other languages can do that are, when you compare apples to apples, faster than C, especially when you factor in modern JIT compilation; and they also do some things slower than a similar function in C.

No, C isn't and won't be obsolete, not until people write popular OSes in other human-readable languages, complete with a body of excellent libraries. We operate in a world of legacy, working code.

Edit: Looks like we both must not have read the article before replying. The author goes over many of these points.

3 comments

haberman 4941 days ago

> Let's be honest: C is no "closer to the metal" than other high level languages

This is dead wrong, and your links do not support it. This is exactly the kind of statement that gets me grumpy.

Your link illustrates that an aggressive C optimizer can collapse a chunk of C code down to something smaller and simpler than the original code. This is true.

But what you said is that C is "no closer to the metal" than other high-level languages. Let's examine this assumption.

Take this C function:

  int plus2(int x) { return x + 2; }

You can compile this down into the following machine code on x86-64, which fully implements the function for all possible inputs and needs no supporting runtime of any kind:

  lea    eax,[rdi+0x2]
  ret

Now take the equivalent function in Python:

  def plus2(x):
    return x + 2

In CPython this compiles down to the following byte code:

  3           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (2)
              6 BINARY_ADD          
              7 RETURN_VALUE
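A listing like this can be reproduced with the standard `dis` module. This is a sketch only: opcode names and offsets differ between CPython versions (newer releases emit a generic BINARY_OP rather than BINARY_ADD, for example), so the exact output is not shown here.

```python
import dis


def plus2(x):
    return x + 2


# Collect the opcode names CPython generated for plus2.
opnames = [ins.opname for ins in dis.get_instructions(plus2)]
print(opnames)

# dis.dis(plus2) prints the full human-readable listing instead.
```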

Notice this is byte code and not machine code. Now suppose we wanted to compile this into machine code: could we get something out of it that looks like the assembly from our C function above? After all, you are claiming that C is "no closer to the metal" than other languages, so surely this must be possible?

The tricky part here is that BINARY_ADD opcode. BINARY_ADD has to handle the case where "x" is an object that implements an overloaded operator __add__(). And if it does, what then? Surely just a very few instructions of machine code will handle this case, if C is "no closer to the metal" than Python?

Well, __add__() can be arbitrary Python code, so the only way you can implement this BINARY_ADD opcode is to implement an entire Python interpreter that runs __add__() in the overloaded operator case. And the Python interpreter is tens of thousands of lines of code in... C.
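To see why the add cannot be reduced to a fixed machine instruction, here is a minimal sketch in which the same plus2() is handed an object whose __add__() does arbitrary work. `Chatty` is a hypothetical class made up for this example.

```python
def plus2(x):
    return x + 2


class Chatty:
    """Hypothetical class whose + operator runs arbitrary Python code."""

    def __init__(self):
        self.log = []

    def __add__(self, other):
        # Anything can happen here: side effects, a non-numeric result, etc.
        self.log.append(other)
        return "added %r" % (other,)


c = Chatty()
# The same "x + 2" handles an int, a float, and a user-defined object.
results = [plus2(3), plus2(2.5), plus2(c)]
```

The caller decides at run time what `x + 2` means, which is exactly what the compiled C version never has to account for.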

The end result is that writing the same function in C and Python is the difference between two machine code instructions and implementing an entire interpreter.

This is why I get grumpy when people deny that C is any different than other high-level languages. While this is a somewhat extreme case, you could make a similar argument about most operations that happen in other high-level languages; similar constructs will very frequently have less inherent cost in C.