HN Mirror

by jbb555 3544 days ago

Agree entirely with all this.

You know what you get with C and it just works as you expect always. I always feel refreshed after writing C.

There are perhaps a few very minor enhancements I'd suggest, but I'd be very reluctant to open the floodgates and ruin it.

3 comments

zigzigzag 3544 days ago

It always works as you expect? Really?

C is one of the few languages, along with C++, that revels in undefined behaviour. It can be very hard to reliably know what a C program will do if it's not written very carefully, because there are so many constructs that look benign but are technically wrong, and which the compiler will mercilessly exploit to optimise your program into nonsense.

deagler 3544 days ago

This. While C and C++ are both very powerful languages and could still be considered the "industry standard" (loosely used), they are definitely not languages that will make your applications work as expected. I would say that C#/Java or higher-level scripting languages like Lua or JavaScript are languages that will let you make "code it and it works" applications.

MaulingMonkey 3544 days ago

C and C++ are the only languages I regularly use wherein I need to resort to disassembly to debug the damn thing.

retro64 3544 days ago

I find this statement hard to imagine. I've been coding professionally in C++ for just shy of 20 years now (and in C for a few years before that) and have only resorted to assembly when I needed to actually use assembly for performance reasons.

Actually - not completely true. I do like to look at the assembly from time to time for other reasons, but this is rare.

MaulingMonkey 3543 days ago

Even when I have performance to worry about, intrinsics are usually an option - possibly even a superior option - when writing performance sensitive code. But here are some of the cases where I resort to looking at program disassembly - I'll note a theme here of debugging optimized code (because who doesn't love heisenbugs):

1) Diagnosing a crash which turned out to be the result of a 1 byte vtable pointer corruption bug in an older codebase, which turned out to be a bad static_cast in relatively removed code (a good case for boost::polymorphic_downcast!). Simply understanding which pointer was bad in the first place required looking at the disassembly - when you can't rely on your debugger's results thanks to optimization.

2) Figuring out the actual values of variables in crashes and crash dumps of optimized builds to properly root cause a bug, when the debugger gets confused - or when the optimizer has simply inlined and reordered everything so aggressively that there are no sensible values to even display (so, most crash dumps.)

3) Noticing when the optimizer has reordered code "unexpectedly", alerting me to the fact that supposedly thread safe code is in fact nowhere near safe and is missing many memory barriers (possibly because their portable macros "helpfully" defaulted to a noop on whatever new and previously unrecognized platform I'm porting to.)

4) Noticing when the optimizer has removed or rewritten code in an "incorrect" manner, helping me debug code that would've worked if it hadn't technically invoked undefined behavior, so I can a) fix it, b) attempt to explain to my coworker that, yes, it's really undefined behavior, and yes, it's actually a problem (typically with a combination of citing the standard and linking INVALID WONTFIX-ed "bugs" in some compiler's bug database), c) be reasonably certain I've actually found the real root cause of a bug and fixed it.
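To make the missing-barrier case in 3) concrete, here is a minimal sketch using C11 `<stdatomic.h>`. The mailbox and the names `publish` and `try_consume` are hypothetical and not from any codebase mentioned here. The release store and acquire load are what stop the compiler and CPU from moving the payload write past the flag.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single-slot mailbox; all names are illustrative only. */
static int payload;                 /* plain data, written before the flag */
static atomic_bool ready = false;   /* publication flag */

/* Producer: write the data, then publish with release ordering so the
   payload store cannot be reordered after the flag store. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: returns true and fills *out once the flag is visible.
   The acquire load pairs with the release store in publish(). */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

If `ready` were a plain `bool` instead, the two stores in `publish` could legally be reordered, which is the kind of thing that only shows up in the disassembly.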

Now, yes, I'll admit this isn't 100% of my debugging sessions. And perhaps I'm an outlier. My coworkers generally learn that I can (eventually) tackle pretty much any weird bug they might be struggling with and that I'm happy to help. All the porting I've done reopens a whole codebase's worth of wounds - latent undefined behavior that another compiler's optimizer didn't take advantage of.

But on the other hand, I've been lucky enough to never encounter a codegen bug in all the compiler and linker bugs I've found. So far. That I know of. And while "rare" by incidence, these are the debugging sessions that can eat weeks at a time for a single bug, when sufficiently nasty and novel.

kbart 3544 days ago

"It always works as you expect? Really?"

Well, it works as defined in the standards. It's just that not every programmer knows what to expect. C is a simple yet powerful language, and with power comes responsibility. It's not like you can throw some libraries/modules/objects (or whatever it is in other, safe languages) together, upload to a server and call it a day -- static testing, debugging and unit testing are a vital part of any semi-serious C project.

JoeAltmaier 3544 days ago

I hear this, but I've rarely experienced it. Maybe I unconsciously avoid such cases with long use. I started with assembler, and debug C/C++ with disassembly turned on, so maybe that's why.

ArkyBeagle 3543 days ago

It's just really not that hard to avoid undefined behavior. Really. The worst is arguably integer overflow, and that's not even all that challenging.
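As a rough illustration of how the integer overflow case can be avoided, here is a minimal sketch. The helper name `checked_add` is made up for this example. The idea is to test against `INT_MAX` and `INT_MIN` before adding, so the overflowing addition is never executed.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: adds a and b only if the result fits in an int.
   Returns false (leaving *result untouched) when the sum would overflow. */
bool checked_add(int a, int b, int *result)
{
    /* The comparisons are arranged so that they cannot overflow themselves. */
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *result = a + b;
    return true;
}
```

Checking after the fact (for example `if (a + b < a)`) does not work for signed types, because the overflow has already happened and the compiler is allowed to assume it never does.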

0xfeba 3544 days ago

C++ is far more prone to undefined behaviour, and even to compilation problems across systems with the STL.

tarancato 3544 days ago

Undefined behaviour is not as prevalent in real world code as reading articles from HN might make you think.

rwmj 3544 days ago

This is a very naive statement. Sure there are a handful of good companies that enable every single compiler warning, and fix those warnings, and then run the code through Coverity, and fix all those problems too. Almost no one else does. The amount of terrible C in the real world is enormous.

w0utert 3544 days ago

>> Sure there are a handful of good companies that enable every single compiler warning.

You think so? Every company I've worked for, or where I've known people who worked there, always enabled -Wall for their C and C++ code. Most OSS software compiles with all warnings enabled.

I think the issue with undefined behavior in C/C++ is extremely overblown. Aside from fun academic examples like 'what does i++++i++ evaluate to', there isn't actually all that much undefined behavior or gotchas in C/C++. I would say there are fewer, compared to other languages I know.

rwmj 3544 days ago

Signed overflow problems are everywhere, even in carefully written code. Using 'int' instead of a more specific type is a code smell. Security code which presumes that because you wrote ptr != NULL, the check is actually carried out. Code that does type punning. Code that doesn't know about aliasing. It goes on and on.
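A minimal sketch of the NULL-check case, using a hypothetical `struct packet` and function names invented for illustration. In the first function the pointer is dereferenced before it is tested, so a compiler may assume it is non-null and delete the test. The second function shows the ordering that keeps the check meaningful.

```c
#include <stddef.h>

struct packet { int length; };   /* hypothetical type for illustration */

/* Risky pattern: p is dereferenced before the check. Dereferencing a null
   pointer is undefined, so the compiler may treat the test as dead code. */
int length_risky(const struct packet *p)
{
    int len = p->length;
    if (p == NULL)
        return -1;
    return len;
}

/* Safer ordering: test first, dereference afterwards. */
int length_checked(const struct packet *p)
{
    if (p == NULL)
        return -1;
    return p->length;
}
```

Both versions look like they contain the same check, which is why this one tends to survive code review.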

You need to know that the problem exists in order to know that you have a problem. There are many C programmers who learned C back in the 1980s who don't even realize these are issues.

tarancato 3544 days ago

I'd say things have changed quite a bit since format string bugs...

majewsky 3544 days ago

> always enabled -Wall for their C and C++ code

Of course you want -Wall -Wextra -Werror -pedantic. ;)

david-given 3544 days ago

...but please, for the love of Mike, don't ship source code with -Werror.

There's nothing like the experience of trying to fix somebody else's code which compiled fine on gcc version 8.97 but which now fails to compile on gcc version 8.98 because the new compiler has some new warnings, which it's now treating as errors.

...and you've got stuff to do, and the program isn't even broken.

mitchty 3544 days ago

Or if you don't have to use gcc, just -Weverything in clang.

GFK_of_xmaspast 3544 days ago

I used to work with a guy who would regularly get upset about the idea of letting the compiler emit warnings because he knew better and didn't want to be bothered with it.

Last I checked he has a couple hundred points on the hacker news internet forums.

Also just last week I found and reported some undefined behavior in a major C++ package that's used by almost every player in as many as several industries. I don't expect it will ever make any difference in production, but it still snuck in.

kbart 3544 days ago

"The amount of terrible C in the real world is enormous."

I'm sure you could say that about pretty much any programming language: "The amount of terrible X in the real world is enormous". There is also plenty of clean, nice, safe C code around (as in any other language), so there's no need to over-generalize ("Almost no one else does").

netheril96 3544 days ago

> I'm sure you could say that about pretty much any programming language: "The amount of terrible X in the real world is enormous".

But the damage is far greater in C. In other languages you won't have arbitrary code execution or privilege escalation just because the programmer is not careful. Nor will there be, in other languages, so many nondeterministic bugs that show up once in a blue moon.

falcolas 3544 days ago

> In other languages you won't have arbitrary code execution or privilege escalation just because the programmer is not careful.

Sure you do. Remember the YAML fiasco with Ruby? How about the thousand-and-one RCE issues with PHP? eval isn't evil for no reason.

kbart 3544 days ago

"In other languages you won't have arbitrary code execution or privilege escalation just because the programmer is not careful"

No, it's possible to make a system insecure with pretty much any language if the programmer is not careful. SQL injection, cross-site scripting, cross-site request forgery, and the list goes on...

0xfeba 3544 days ago

Yeah, I do web development. I've worked with JavaScript, PHP, and, sigh, classic ASP.

There's bad code everywhere. Some languages make it a bit easier, but it's really not the language's fault.

majewsky 3544 days ago

https://en.wikipedia.org/wiki/Sturgeon's_law

cwyers 3544 days ago

There are very few programming languages where the total lines of code written is larger than the amount of bad C code written.

kbart 3544 days ago

There are very few programming languages where the total lines of code written is even comparable to C, so of course there is more bad code too.

MaulingMonkey 3544 days ago

You're right - it's far more prevalent, if the bug database for basically every C or C++ codebase I've ever seen is any indication.

ArkyBeagle 3543 days ago

it's a heatmap.

CountSessine 3543 days ago

There are some very interesting surveys and studies that suggest otherwise. Undefined behaviour due to integer overflow seems to be very common.

http://www.cs.utah.edu/~peterlee/papers/tosem15.pdf

akvadrako 3544 days ago

I would tend to disagree. For example, casting void * to other pointer types is undefined, but this construct is often used, for example:

  void func(void *ptr)
  {
      uint32_t *ip = ptr;
      ip[0] = 123;
  }

zosima 3544 days ago

This is wrong. Casting to and from void * is defined for all pointer types except function pointers, according to the standard.

Otherwise most every assignment after call to malloc would be undefined.

akvadrako 3544 days ago

Yes, you are right, void * is an exception. However, any other pointer cannot be reliably cast:

From C1X, section 6.3.2.3:

"A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined."

Though that is quite odd, since any pointer can be converted to void* , which only needs alignment to the char type. So converting from x* -> y* is undefined, but x* -> void* -> y* is defined.

planteen 3544 days ago

That might not necessarily work. If you have something like:

  uint8_t x[100];
  uint32_t *y = (uint32_t *)&x[1];

And then dereference y, most RISC architectures will trap on the unaligned access. It doesn't matter if there is an intermediate void pointer or not.
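One common way to sidestep the trap is to copy the bytes out rather than dereference a misaligned pointer. A minimal sketch, with `load_u32` as a made-up helper name:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a uint32_t from an address with any alignment.
   memcpy has no alignment requirement, and compilers typically turn a
   fixed-size copy like this into a single load where the target allows it. */
uint32_t load_u32(const void *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

With the array above, `load_u32(&x[1])` reads the same four bytes that `*y` would, without ever forming a misaligned `uint32_t *`.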

MaulingMonkey 3544 days ago

To try and give some actual examples:

Undefined:

  int64_t a = 42;
  void* p = &a;
  int32_t* i = p;
  printf("%i", *i);

Implementation defined, as type punning to char is legal (allowing the implementation of memcpy):

  int64_t a = 42;
  void* p = &a;
  char* ch = p;
  printf("%c", *ch);

Exercise left to the reader: Implement a "fast" memcpy (e.g. one that will copy more than 1 byte at a time for large copies, as your standard library implementation likely does) without violating strict aliasing rules.

Koshkin 3544 days ago

If you try and hit your finger with a hammer, your subsequent behavior is undefined. Please do not do that.

akvadrako 3544 days ago

Where in the standard does it say your first example is undefined?

MaulingMonkey 3544 days ago

Since I don't have a copy of the C standard handy, I'll reference this which covers the relevant sections of C++03, C++11, C99, and C11: http://stackoverflow.com/a/7005988/953531 . Quoting the C99 version below (§6.5 ¶7):

  An object shall have its stored value accessed only by an lvalue expression that has one of the following types 73) or 88):

  * a type compatible with the effective type of the object,
  * a qualified version of a type compatible with the effective type of the object,
  * a type that is the signed or unsigned type corresponding to the effective type of the object,
  * a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  * an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  * a character type.

  73) or 88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.

Bullet 6 is what allows the second sample to have defined behavior. For the first sample, unless I'm seriously mistaken, int32_t isn't considered "a type compatible with" int64_t. Bullet 2 talks of "qualified" versions of types - I believe this is referencing const/volatile qualified types. Bullet 3 apparently allows you to type pun (unsigned int) to (signed int) or vice versa? Which is an interesting bit of new trivia to me. Bullet 4 is much of the same, bullet 5 requires a nonexistent union, and bullet 6 requires a character type.
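For comparison, a minimal sketch of two ways to look at the bytes of an int64_t that stay within these rules. Both function names are invented for this example. The first copies the bytes with memcpy instead of reading through an int32_t pointer, and the second relies on bullet 6.

```c
#include <stdint.h>
#include <string.h>

/* Copy the first four bytes of a 64-bit object into a 32-bit one.
   Which half of the value you get depends on the platform's byte order. */
int32_t first_four_bytes(int64_t wide)
{
    int32_t narrow;
    memcpy(&narrow, &wide, sizeof narrow);
    return narrow;
}

/* Bullet 6: reading an object through a character type is allowed. */
unsigned char first_byte(const int64_t *wide)
{
    return *(const unsigned char *)wide;
}
```

For a = 42, `first_four_bytes(a)` yields 42 on a little-endian machine and 0 on a big-endian one, so the result is implementation-defined rather than undefined.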

koorogi 3544 days ago

Casting pointers is well defined. What is undefined behavior is to dereference a pointer whose type does not match the pointed-to object.

akvadrako 3544 days ago

That's not correct in general; casting pointers is possibly undefined. However, it does seem I made a mistake trying to use void * as an example.

radmuzom 3544 days ago

One of the best things about C is it is not upgraded frequently like other languages. So there are no frequent <language> <x.y> released posts here like we see for other languages.

creshal 3544 days ago

On the downside, when it does get updated, compilers take ages to implement the new features, and in the meantime make up busywork like "let's break OS kernels or crypto code to get faster in some random benchmark nobody cares about!"

dom0 3544 days ago

Don't need C updates for that, a GCC upgrade is quite sufficient! :(

joosters 3544 days ago

gcc is generally used as a testing ground for new C/C++ features. So in most cases, the compiler supports new features before they are 'officially released' into the language.

It's the complete opposite of waiting for a feature to appear in the compiler.

creshal 3544 days ago

GCC didn't get sort-of-complete C11 features until 4.9 (2014) and still omits (largely useless) optional features.

bkjsbkjdnf 3544 days ago

> You know what you get with C and it just works as you expect always.

Every language works just as you expect if you have the right expectations.