by lelanthran 842 days ago
I think every graduating student should work on a non-trivial application in plain C for a year before moving on to another language.

It makes you exceptionally paranoid about failure states and practically requires a bit of thought and planning before attempting any non-trivial change.

The mindset of "it's fine to ignore all error conditions and let the default exception handler print a stack trace to the user" results in software that is annoying to the user.

5 comments

While I agree with the last two paragraphs, C is not good even for that purpose because it doesn't give you any tools to manage failure states. An effective C education should really be paired with various static analyses and formal verification strategies.
> While I agree with last two paragraphs, C is not good even for that purpose because it doesn't give any tool to manage them.

That is the reason it should be used as a learning tool. So that you know the nitty gritty details without anyone "managing" it away from you.

Then learn an assembly language instead, because C also has a fair amount of bookkeeping hidden behind the scenes.

C is typically described as a "low-level" programming language, where "low-level" normally refers to the supposed distance from the language to the actual hardware. But as many incidents with UB demonstrate, this distance is still considerably larger than expected. I think there is another sense of the word "low-level", which is the amount of abstraction that is either built into the language or available to users, and C doesn't have a lot of it.

Combined, they represent related but distinct axes of controllability, and C achieves only a modest level of controllability on one axis but not on the other. By comparison, an ideal language for controllability should minimize the distance to the machine while maximizing the amount of abstraction available to control everything below the language.

> Then learn an assembly language instead

Learn assembly TOO, not instead. I did, as part of a computer architecture course. Very valuable. I think you should learn everything from the transistor level up if you want to do serious programming. I don't think you need to actually use it, but it's sometimes very handy to know those things.

It's important to set the correct expectations for what you learn. C is indeed important to learn because of all the legacy and current code bases, but it's just one of several possible languages for learning computer architecture, and learning C alone doesn't give you that knowledge.
> C also has a fair amount of bookkeeping hidden behind the scenes

Hosted C does, in the form of the standard library. C does have a freestanding variant though, and its bookkeeping is generally limited to knowing struct member offsets.
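To make the "struct member offsets" point concrete, here is a minimal sketch. The `struct record` type and the `record_value_offset` function are hypothetical names made up for illustration; `offsetof` and `<stddef.h>` are standard and available even in freestanding implementations.

```c
#include <stddef.h>

/* Hypothetical record type, used only to illustrate layout bookkeeping. */
struct record {
    char tag;
    int  value;
};

/* The compiler decides where each member lives (including any padding);
 * offsetof exposes that decision as a constant. */
size_t record_value_offset(void)
{
    return offsetof(struct record, value);
}
```

The exact offset of `value` depends on the platform's alignment rules, which is why the sketch does not claim a specific number.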

This is mostly only true for modern optimising C compilers. If you take a simple C compiler from the 90's, or disable optimisations in a modern compiler, there's a near 1:1 relationship between the C code and compiler output.
Yes, but I don't think you necessarily argue for using those simpler compilers today.
Of course not, but it might explain where the idea comes from that "C is high level assembly". Apart from that, I think every programmer should play around with godbolt at least once in a while to get an idea how the high level source code maps to machine code. Even with optimizations enabled, the output of C code is usually much closer to the source (e.g. more "recognisable") than (for instance) highly abstracted C++ or Rust code.
Well, there are such tools for C, but wouldn't using them be detrimental in this context? Think of using a debugger vs. trying to trace the execution in one's mind: I'm not saying that one shouldn't use debuggers, but not using one has benefits as a teaching device.

Like running in a weight vest.

Edit: ah, perhaps you meant that, in addition to using raw C, one should also learn how to use such static analyzers & co.

> Edit: ah, perhaps you meant that, in addition to using raw C, one should also learn how to use such static analyzers & co.

Exactly. Sorry for my unclear wording.

As the author notes, to know what C code does you need to run it. A good debugger is a C programmer's best friend.
Not necessarily:

> A year or two after I'd joined the Labs, I [Rob Pike] was pair programming with Ken Thompson on an on-the-fly compiler for a little interactive graphics language designed by Gerard Holzmann. I was the faster typist, so I was at the keyboard and Ken was standing behind me as we programmed. We were working fast, and things broke, often visibly—it was a graphics language, after all. When something went wrong, I'd reflexively start to dig in to the problem, examining stack traces, sticking in print statements, invoking a debugger, and so on. But Ken would just stand and think, ignoring me and the code we'd just written. After a while I noticed a pattern: Ken would often understand the problem before I would, and would suddenly announce, "I know what's wrong." He was usually correct. I realized that Ken was building a mental model of the code and when something broke it was an error in the model. By thinking about how that problem could happen, he'd intuit where the model was wrong or where our code must not be satisfying the model.

A debugger can only tell you what the executable does: the program compiled with a particular compiler on a particular platform, with particular code generation options.

It can tell you that i = i++ increased i by 2, for instance. That might not even be true of another instance of i = i++ in the same object file being debugged.

There is no substitute for knowing what the C will do before it is run.

> While I agree with last two paragraphs, C is not good even for that purpose because it doesn't give any tool to manage them.

The "purpose" here is to instill a sense of paranoia around error handling, so that the resulting program from the PoV of the user appears to handle whatever bizarre combination of inputs the user put in.

It's the difference between telling the user "Failed to open foo.txt (file not found). Do you want to create it?" and ending the program with a 100-line stack trace with "FileNotFoundException" buried somewhere in there.

As the article says, checked exceptions are not the solution here.

I've literally never seen, in a professional working environment, exception languages (Java, C#, Python, etc) actually check that the file they tried opening was actually opened, and if it wasn't, directing the user with a sensible message that allowed the user/operator to fix the problem.

In C, the very first time that you fail to check that `fopen` returned non-NULL, the program crashes. Then you check if it returned NULL, and need to put something in the handling code, so you look at what `errno` has, or convert `errno` to a string.
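A minimal sketch of that check, assuming a hypothetical helper name `open_or_explain` (not from this thread). It uses only standard `fopen`, `errno`, `strerror`, and `snprintf`. Note that POSIX specifies that `fopen` sets `errno` on failure; ISO C by itself does not promise this.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to open `path` for reading. On failure, write a
 * human-readable reason into `msg` and return NULL, so the caller has
 * something to show the user instead of a crash. */
FILE *open_or_explain(const char *path, char *msg, size_t msg_len)
{
    errno = 0;
    FILE *inf = fopen(path, "r");
    if (inf == NULL) {
        int saved = errno; /* copy before any other call can overwrite it */
        snprintf(msg, msg_len, "Failed to open %s (%s)", path, strerror(saved));
        return NULL;
    }
    return inf;
}
```

A caller would check the return value, print `msg` on NULL, and `fclose` the stream on success.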

I will bet good money that you could grab the nearest C#/Java/Python/JS/etc engineer to you, ask them to find the most recent code they wrote that opened and read/wrote a file, and you'll find that there is literally no code to direct the user if the file-open failed. The default runtime handler steps in and vomits a stack trace onto the screen.

In C, you are forced to perform the NULL-check, or crash. Sure, many devs are simply going to have a no-op code-path for the error cases, doing `if ((inf = fopen(...)) != NULL) { DoSomethingWith(inf);}`, but proceeding only on success, with no `else` branch, is a code-smell that is easy to spot visually.

The exception languages make it virtually impossible to spot the code-smell of unhandled (or improperly handled) exceptions, and make it easy to leave them that way, because the dev can just read the stack trace, create the file needed, and proceed with programming the rest of the app.

What a good program must do when a file open failure is encountered is direct the user in some way that they can fix the problem. For example "file doesn't exist. Create it [button], Choose a file [button]", or "Permission denied. Try running as a different user.", or "File is locked. Are you already running $PROGRAM?", or "$FOO is a directory. Specify a file.".
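As a rough sketch of that idea in C, the function below maps a few `errno` values from a failed open to advice a user can act on. The function name `advice_for_open_error` and the message wording are hypothetical; the `errno` constants are POSIX. Which value you actually get varies by platform (for example, `EISDIR` is typically reported when a directory is opened for writing), and the "file is locked" case is left out because it has no single portable `errno` value.

```c
#include <errno.h>

/* Hypothetical mapping from an errno value (as left by a failed open) to
 * advice the user can act on. The wording is illustrative only. */
const char *advice_for_open_error(int err)
{
    switch (err) {
    case ENOENT: return "File doesn't exist. Create it or choose another file.";
    case EACCES: return "Permission denied. Try running as a different user.";
    case EISDIR: return "That is a directory. Specify a file.";
    default:     return "Could not open the file. Check the path and try again.";
    }
}
```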

[EDIT: Yes, seeing a stack trace in a shipped product is one of my personal bugbears that I feel very strongly about. If it's a stack trace for an expected error (like failure to open/read/write a file) I absolutely do get annoyed by this public display of laziness. And yes, this is one of those hills I'll die on before I leave it!]

Even the simplest command-line programs annoy me no end when the application simply dumps a stack trace to the screen. Sure, I can dig into it, but the average user is going to ask for help on stackoverflow, just to figure out what must be done to fix the error.

> As the article says, checked exceptions are not the solution here.

Java had a very bad model of checked exceptions (the OP was written in 2008). A correct way is to make it a part of the type system, though it doesn't have to be a sum type like Rust, and make any error-related code path as convenient as possible to use.

> In C, you are forced to perform the NULL-check, or crash.

You don't necessarily crash if you failed to perform a NULL check! That's literally the single biggest problem with C's undefined behaviors. C looks like forcing checks only because most programmers do understand crashes are bad, so they do prepare for trivial or demonstrated crashes. But that's not guaranteed, and they can't easily prepare for non-crash failures without additional tools.

> You don't necessarily crash if you failed to perform a NULL check!

In the case of using the result from `fopen`, I don't know of a platform where a dereferencing of NULL (which happens in a separate translation unit, which is already compiled and linked, and will not be subject to LTO and other optimisations) within the various read/write/seek/tell functions doesn't result in an immediate crash.

I fully admit that this is applicable only to this particular example, and to all the functions in the stdlib. Everywhere else (code you wrote, that will be subject to aggressive optimisation, for example), you may not necessarily crash on a NULL dereference.

In the sense of instilling a sense of paranoia, the relative frequency of crashing due to UB is high enough that it does develop the sense of paranoia.

That's a really huge asterisk that wasn't present in your original claim ;-)

> In the sense of instilling a sense of paranoia, the relative frequency of crashing due to UB is high enough that it does develop the sense of paranoia.

Paranoia isn't a cure, however. A good programmer will, and arguably should, develop an instinct to avoid C for most cases instead. I too have written tons of C code, and yet I feel really uneasy about using C at all. I can't believe that C merely induces the sense of paranoia.

> I don't know of a platform where a dereferencing of NULL (which happens in a separate translation unit, which is already compiled and linked, and will not be subject to LTO and other optimisations) within the various read/write/seek/tell functions doesn't result in an immediate crash.

Although in a very different context, I have seen "dereferencing" a null pointer in C++ not crash immediately, if you dereference it to call a nonvirtual class member function, e.g.,

    t->foo();
Depending on how this gets compiled and the implementation of `foo()`, the segfault may not come at the line above, where technically `t` is being dereferenced. It may come inside `foo`, or somewhere further down the call chain. The resulting crash may not even manifest as a segfault.
> Although in a very different context, I have seen "dereferencing" a null pointer in C++ not crash immediately,

I don't think this is possible at all in C, which has neither classes nor the sophisticated following of pointers to find a method.

Eh, Java/js or whatever developers working on user applications can be very user minded simply by the fact that they work on a lot of user facing applications. I’ll redirect the user if it makes sense to me as a product. I don’t buy it that C forcing you to handle the input is advantageous in this regard.
When I was a student, I struggled to learn algorithms because we had to code in C. Luckily we had a wonderful teacher who listened to us, students, and switched his teaching to Pascal. That was a big relief to all of us.

I believe languages like Pascal and Python are good for introducing you to algorithms. After that, I agree you would need to dive into languages such as C, Rust and, why not, assembly language to have a better understanding of your machine.

In our case, after one year of coding in Pascal, we spent the remaining years focusing on C and C++ (Builder).

After that, it's time to move to pragmatic choices (job market requirements in terms of development stack).

C was the first language covered at my University back in the early 2000s

It was not a problem for me as I was learning Turbo C and Visual C++ 5/6.0 a year or two beforehand.

Everyone else in the class, though, was sooooo frustrated with their "89 errors, 103 Warnings", all because they forgot to add a semicolon in the code.

Truth is, they were not getting anywhere near understanding how to write code or care about its quality, which is what leads to proper planning, etc. They would just keep changing something to reduce the errors/warnings.

Personally, I think every person has their own journey into the world of programming. For me, I was happy for it to be C, with a bit of Pascal and Visual Basic. For someone else, perhaps Scheme and Javascript. Another maybe Java.

Some developers/programmers, in my opinion, are not for C. Controversial... I know.

I think writing in Haskell or Rust and using ADTs and pattern matching with exhaustivity checking does the same thing.
Rust is better for this since your program won’t even compile until you’ve handled those error states.
> Rust is better for this since your program won’t even compile until you’ve handled those error states.

How is that better for developing a sense of paranoia around error states?

"Throw it at the wall and see what sticks" does not exactly lead to "extreme paranoia managing errors".

You're placing a lot of trust in it not being undefined behavior.

Additionally, of course anything can be done with enough time and effort, but the costs add up when you are doing things manually rather than letting the compiler handle it. The compiler has had many person-years of effort spent on ensuring the output is correct; can we do the same for all the code we write?

It helps because the instinct is to write lazy code that may fail and the rust compiler will annoy you until it compiles. Every annoyance to fix is a thing you usually need to think about in other languages.

For example, if I'm coding in C#, it's easier for me to understand the impact of passing around resources that need to be disposed, and good patterns to handle that, after Rust has made me lose hair over this concept.

This raises the question of whether that extreme paranoia prevents critical errors in code. The fact that so many of the issues still persist suggests it does not.

You can tell people to be careful drivers all you want, but what really saves lives is airbags, crumple zones, and seatbelts.

Because you can't let a driver crash and have a fatal accident as a lesson. That's what this is about. Make it hard and cumbersome, by having students build something non-trivial. I think another benefit of doing it is the Ikea effect, after you've put some effort into the project and see it come together, you might start caring about it and are motivated to get it to work well. Hopefully then some of that mindset carries over when using high level languages and huge frameworks.
> This raises the question of whether that extreme paranoia prevents critical errors in code. The fact that so many of the issues still persist suggests it does not.

> You can tell people to be careful drivers all you want, but what really saves lives is airbags, crumple zones, and seatbelts.

Could just be that the frequency of repeated accidents (and the related injuries) is too low to instill paranoia. The analogy does not carry over to programming in C, where the frequency is "multiple times a day", and not "less than once in a lifetime".

IOW, I feel that

> The fact that so many of the issues still persist suggests it [extreme paranoia] does not.

is inaccurate.

Back when every developer had to build their own failure detection code and display messages to help themselves debug anything, almost every developer was good at communicating errors to the user.

Today developers don't build that skill, so you see applications that just fail silently everywhere or produce nonsense errors. The better developer tools you have, the less your developers will need to learn UX skills to be able to do their work.

Back in those days the error you got was "Segmentation fault (core dumped)"

Now it's an exact line number with a little description of what went wrong and sometimes a suggestion on how to fix it.

Agreed. And the example of crashing due to a NULL file descriptor is in the minority. Often C error handling involves checking errno. This is routinely ignored, and when some call "fails", the code continues to some point later where things don't quite work right. When this happens, it's usually much harder to determine the cause, given that the root failure context is gone.
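One common instance of this is `strtol`: if its `errno` result is ignored, an out-of-range input silently becomes `LONG_MAX` and the failure surfaces somewhere else later. Below is a minimal sketch of checking at the call site; the helper name `parse_long` and its return convention are assumptions made for illustration, while `strtol`, `errno`, and `ERANGE` are standard C.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a base-10 long and report failure at the call
 * site instead of letting a bad value flow onward.
 * Returns 0 on success, -1 on failure (no digits, trailing junk, overflow). */
int parse_long(const char *text, long *out)
{
    char *end = NULL;
    errno = 0;                     /* strtol does not clear errno on success */
    long value = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;                 /* no digits, or trailing characters */
    if (errno == ERANGE)
        return -1;                 /* value out of range for long */
    *out = value;
    return 0;
}
```

Because the check happens right after the call, the caller still knows which input failed and why.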

In my experience, C developers sometimes just dislike exceptions in other languages because exceptions defy C code path expectations. That was my first instinct when going from C to other languages. And so additional arguments against exceptions are put forth, including performance issues (which are real) and this idea of paranoia being better than language tooling (which is rather suspect imo).

> And so additional arguments against exceptions are put forth, including performance issues (which are real) and this idea of paranoia being better than language tooling (which is rather suspect imo).

I feel you are mischaracterising my position, which was to have graduating students work in C for a non-trivial amount of time before they moved on to a new language.

The argument that paranoia is better than language tooling is entirely absent from my arguments.

It will compile just fine if you call unwrap on a Result and then you get a wonderful error. If you call unwrap on a File::open() you do not even get the file name.

If you run this

    use std::fs::File;

    pub fn main() {
        let fp = File::open("test").unwrap();
    }

You get this:

    thread 'main' panicked at /app/example.rs:11:33:
    called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

The language incentivizes you to propagate errors up to main and then just let it fail.

You still have to explicitly do this and are made aware of this failure state. It's also much easier to reject PRs when you see an unwrap than to try to think of all the ways it could invisibly fail.