by lifthrasiir 842 days ago
While I agree with last two paragraphs, C is not good even for that purpose because it doesn't give any tool to manage them. An effective C education should really be paired with various static analyses and formal verification strategies.

> While I agree with last two paragraphs, C is not good even for that purpose because it doesn't give any tool to manage them.

That is the reason it should be used as a learning tool. So that you know the nitty gritty details without anyone "managing" it away from you.

Then learn an assembly language instead, because C also has a fair amount of bookkeeping hidden behind the scene.

C is typically described as a "low-level" programming language, where "low-level" normally refers to the supposed distance from the language to the actual hardware. But as many incidents with UB demonstrate, this distance is still considerably larger than expected. I think there is another sense of the word "low-level", which is the number of abstractions that are either built into the language or available to users, and C doesn't have a lot of them.

Combined, they represent related but distinct axes of controllability, and C achieves only a modest level of controllability on one axis and not on the other. By comparison, an ideal language for controllability should minimize the distance to the machine and maximize the amount of abstraction available to control anything below the language.

> Then learn an assembly language instead

Learn assembly TOO, not instead. I did, as part of a computer architecture course. Very valuable. I think you should learn everything from the transistor level up if you want to do serious programming. I don't think you need to actually use it, but it's sometimes very handy to know those things.

It's important to set the correct expectation for what you learn. C is indeed important to learn because of all the legacy and current code bases, but it's just one of several possible language choices for learning computer architecture, and learning C alone doesn't give you the relevant knowledge.

> C also has a fair amount of bookkeeping hidden behind the scene

Hosted C does, in the form of the standard library. C does have a freestanding variant, though, and its bookkeeping is generally limited to knowing struct member offsets.

This is mostly only true for modern optimising C compilers. If you take a simple C compiler from the '90s, or disable optimisations in a modern compiler, there's a near 1:1 relationship between the C code and compiler output.

Yes, but I don't think you necessarily argue for using those simpler compilers today.

Of course not, but it might explain where the idea comes from that "C is high level assembly". Apart from that, I think every programmer should play around with godbolt at least once in a while to get an idea of how the high level source code maps to machine code. Even with optimizations enabled, the output of C code is usually much closer to the source (e.g. more "recognisable") than (for instance) highly abstracted C++ or Rust code.

> Even with optimizations enabled, the output of C code is usually much closer to the source (e.g. more "recognisable") than (for instance) highly abstracted C++ or Rust code.

That's usually true, but you can write C++ and Rust code that maps more closely to machine code as well. Most C code exhibits that only because you can't have enough abstractions to disrupt that mapping. It is good when you do need that kind of correspondence, but most applications rarely need it, and even performance-sensitive applications don't need it all the time. C does give you a knob, but that knob is stuck in a lower but not lowest position.

Well, there are such tools for C, but wouldn't using them be detrimental in this context? Think, like using a debugger vs. trying to wrap the execution in one's mind: I'm not saying that one shouldn't use debuggers, but not using one has benefits, as a teaching device.

Like running in a weight vest.

Edit: ah, perhaps you meant, in addition to using raw C, one should also learn how to use such static analyzers & cie

> Edit: ah, perhaps you meant, in addition to using raw C, one should also learn how to use such static analyzers & cie

Exactly. Sorry for my unclear wording.

As the author notes, to know what C code does you need to run it. A good debugger is a C programmer's best friend.

Not necessarily:

> A year or two after I'd joined the Labs, I [Rob Pike] was pair programming with Ken Thompson on an on-the-fly compiler for a little interactive graphics language designed by Gerard Holzmann. I was the faster typist, so I was at the keyboard and Ken was standing behind me as we programmed. We were working fast, and things broke, often visibly—it was a graphics language, after all. When something went wrong, I'd reflexively start to dig in to the problem, examining stack traces, sticking in print statements, invoking a debugger, and so on. But Ken would just stand and think, ignoring me and the code we'd just written. After a while I noticed a pattern: Ken would often understand the problem before I would, and would suddenly announce, "I know what's wrong." He was usually correct. I realized that Ken was building a mental model of the code and when something broke it was an error in the model. By thinking about how that problem could happen, he'd intuit where the model was wrong or where our code must not be satisfying the model.

A debugger can only tell you what the executable does: the program compiled with a particular compiler on a particular platform, with particular code generation options.

It can tell you that i = i++ increased i by 2, for instance. That might not even be true of another instance of i = i++ in the same object file being debugged.

There is no substitute for knowing what the C will do before it is run.

> While I agree with last two paragraphs, C is not good even for that purpose because it doesn't give any tool to manage them.

The "purpose" here is to instill a sense of paranoia around error handling, so that the resulting program from the PoV of the user appears to handle whatever bizarre combination of inputs the user put in.

It's the difference between telling the user "Failed to open foo.txt (file not found). Do you want to create it?" and ending the program with a 100-line stack trace with "FileNotFoundException" buried somewhere in there.

As the article says, checked exceptions are not the solution here.

I've literally never seen, in a professional working environment, exception languages (Java, C#, Python, etc) actually check that the file they tried opening was actually opened, and if it wasn't, directing the user with a sensible message that allowed the user/operator to fix the problem.

In C, the very first time that you fail to check that `fopen` returned non-NULL, the program crashes. Then you check if it returned NULL, and need to put something in the handling code, so you look at what `errno` has, or convert `errno` to a string.

I will bet good money that you could grab the nearest C#/Java/Python/JS/etc engineer to you, ask them to find the most recent code they wrote that opened and read/wrote a file, and you'll find that there is literally no code to direct the user if the file-open failed. The default runtime handler steps in and vomits a stack trace onto the screen.

In C, you are forced to perform the NULL-check, or crash. Sure, many devs are simply going to have a no-op code-path for the error cases, doing `if ((inf = fopen(...)) != NULL) { DoSomethingWith(inf);}`, but proceeding only on success, with no error branch, is a code-smell and easy to visually spot as an error.

The exception languages make it virtually impossible to spot the code-smell of unhandled (or improperly handled) exceptions, and make it easy to leave them that way because the dev can just read the stack trace, create the file needed, and proceed with programming the rest of the app.

What a good program must do when a file open failure is encountered is direct the user in some way that they can fix the problem. For example "file doesn't exist. Create it [button], Choose a file [button]", or "Permission denied. Try running as a different user.", or "File is locked. Are you already running $PROGRAM?", or "$FOO is a directory. Specify a file.".

[EDIT: Yes, seeing a stack trace in a shipped product is one of my personal bugbears that I feel very strongly about. If it's a stack trace for an expected error (like failure to open/read/write a file) I absolutely do get annoyed by this public display of laziness. And yes, this is one of those hills I'll die on before I leave it!]

Even the simplest command-line programs annoy me no end when the application simply dumps a stack trace to the screen. Sure, I can dig into it, but the average user is going to ask for help on stackoverflow, just to figure out what must be done to fix the error.

> As the article says, checked exceptions are not the solution here.

Java had a very bad model of checked exceptions (the OP was written in 2008). A correct way is to make it a part of the type system, though it doesn't have to be a sum type like Rust, and make any error-related code path as convenient as possible to use.

> In C, you are forced to perform the NULL-check, or crash.

You don't necessarily crash if you failed to perform a NULL check! That's literally the single biggest problem with C's undefined behaviors. C looks like forcing checks only because most programmers do understand crashes are bad, so they do prepare for trivial or demonstrated crashes. But that's not guaranteed, and they can't easily prepare for non-crash failures without additional tools.

> You don't necessarily crash if you failed to perform a NULL check!

In the case of using the result from `fopen`, I don't know of a platform where a dereferencing of NULL (which happens in a separate translation unit, which is already compiled and linked, and will not be subject to LTO and other optimisations) within the various read/write/seek/tell functions doesn't result in an immediate crash.

I fully admit that this is applicable only to this particular example, and to all the functions in the stdlib. Everywhere else (code you wrote, that will be subject to aggressive optimisation, for example), you may not necessarily crash on a NULL dereference.

In the sense of instilling a sense of paranoia, the relative frequency of crashing due to UB is high enough that it does develop the sense of paranoia.

That's a really huge asterisk that wasn't present in your original claim ;-)

> In the sense of instilling a sense of paranoia, the relative frequency of crashing due to UB is high enough that it does develop the sense of paranoia.

Paranoia isn't a cure, however. A good programmer will, and arguably should, develop an instinct to avoid C for most cases instead. I too have written tons of C code, and yet I feel really uneasy about using C at all. I can't believe that C merely induces the sense of paranoia.

> I don't know of a platform where a dereferencing of NULL (which happens in a separate translation unit, which is already compiled and linked, and will not be subject to LTO and other optimisations) within the various read/write/seek/tell functions doesn't result in an immediate crash.

Although in a very different context, I have seen "dereferencing" a null pointer in C++ not crash immediately, if you dereference it to call a nonvirtual class member function, e.g.,

    t->foo();

Depending on how this gets compiled and the implementation of `foo()`, the segfault may not come at the line above, where technically `t` is being dereferenced. It may come inside `foo`, or somewhere further down the call chain. The resulting crash may not even manifest as a segfault.

> Although in a very different context, I have seen "dereferencing" a null pointer in C++ not crash immediately,

I don't think this is possible at all in C, which doesn't have classes, and the sophisticated following of pointers to find a method.

Eh, Java/JS or whatever developers working on user applications can be very user minded simply by the fact that they work on a lot of user facing applications. I’ll redirect the user if it makes sense to me as a product. I don’t buy it that C forcing you to handle the input is advantageous in this regard.