by fithisux 2853 days ago
The title is completely misleading.

I write a ton of C and I completely agree with the title. With 20+ years of experience.

Kernel drivers and embedded system bare metal firmware.

The problem with C is that in any bigger project something always slips through even the best programmers, reviewers, static analysis and unit tests. And that something can lead to disastrous crashes and security vulnerabilities.

Don't you think that with the tools we have now it's easier to control the quality of the code produced (Clang memory sanitizers and so on)? I feel more at ease shipping C code today after instrumenting it than I did a few years ago...
Tools absolutely help to reduce defects. That's why you use them.

That said, sometimes I'm shocked at what kind of disasters get past the analyzers.

Stakes are higher than ever. It's not just about functional correctness and avoiding crashes anymore. Your code needs to be secure against malicious actions from the outside world. Getting rid of counterintuitive security vulnerabilities is very, very hard.

I would say that is why security-conscious developers use them.

Sadly we are a very, very tiny percentage, as shown by Herb Sutter's question to the audience at CppCon (1% of the audience answered positively) and by the frequent CVE updates.

Not really, as is proven on an almost daily basis.

https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=memory+corr...

How do you know that developers working on those used tools such as the Clang Memory Sanitizer?
Because many on that list are well known FOSS projects that supposedly have such processes in place, including manual review before accepting patches into mainline, like the Linux kernel being discussed here.
For embedded systems I mostly go with "dynamic memory allocation of any kind is evil" and that solves a lot of issues already.

You can still overwrite memory but it suddenly became much less likely.
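To make the "no dynamic allocation" rule concrete, here is a minimal sketch of a byte queue whose storage is reserved at link time. The names (`queue_push`, `queue_pop`) and the capacity are made up for illustration, not taken from any particular codebase. The point is that a full queue becomes an error code the caller has to handle, instead of a `malloc` failure or a silent overwrite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAPACITY 8 /* hypothetical capacity, fixed at compile time */

/* All storage is static: no malloc, no free, no heap fragmentation. */
static uint8_t queue_data[QUEUE_CAPACITY];
static size_t queue_head;  /* index of the oldest element */
static size_t queue_count; /* number of stored elements */

/* Appends one byte. Returns 0 on success, -1 if the queue is full. */
int queue_push(uint8_t value)
{
    assert(queue_count <= QUEUE_CAPACITY);
    if (queue_count == QUEUE_CAPACITY)
        return -1; /* reject instead of overwriting old data */
    queue_data[(queue_head + queue_count) % QUEUE_CAPACITY] = value;
    queue_count++;
    return 0;
}

/* Removes the oldest byte into *out. Returns 0 on success, -1 if empty. */
int queue_pop(uint8_t *out)
{
    if (queue_count == 0)
        return -1;
    *out = queue_data[queue_head];
    queue_head = (queue_head + 1) % QUEUE_CAPACITY;
    queue_count--;
    return 0;
}
```

Every index goes through the modulo and the count check, so the only way left to corrupt memory here is to bypass these two functions.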

> For embedded systems I mostly go with "dynamic memory allocation of any kind is evil" and that solves a lot of issues already.

Yeah, bare metal systems often don't allocate at all. Although one sin they often do commit is using the same buffer for multiple purposes. What could go wrong...

Perhaps even more common is allocating a buffer on the stack and somehow writing past its bounds. Also DMA to/from the stack is usually not a great idea...

The things above sound dumb, but they can easily happen when you build your abstraction layers and use them carelessly.
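One way to make the out-of-bounds write harder to commit by accident is to never pass a bare pointer through the abstraction layers, only a pointer together with its capacity. A rough sketch, with a hypothetical `struct span` and `span_append` invented for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer view: the capacity travels with the pointer. */
struct span {
    unsigned char *data;
    size_t cap; /* total bytes available in data */
    size_t len; /* bytes already written, always <= cap */
};

/* Appends n bytes from src. Returns 0 on success, -1 if they do not fit. */
int span_append(struct span *s, const void *src, size_t n)
{
    assert(s->len <= s->cap);
    if (n > s->cap - s->len)
        return -1; /* would write past the end; refuse */
    memcpy(s->data + s->len, src, n);
    s->len += n;
    return 0;
}
```

A caller would wrap a stack array as `struct span s = { storage, sizeof storage, 0 };` and then only write through `span_append`. This does nothing for the DMA-to-stack problem, and nothing stops someone from still writing through `s.data` directly.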

> DMA to/from stack

wait what oh my god

That only eliminates a certain class of bugs. There are still plenty of foot-shotguns available - memcpy/memset, strlen, gets/puts, printf, any file I/O, networking calls, etc.
This is my view as well, from the same industry. However, the quality of the tools available in C to deal with its issues far exceeds those in any other language. I would love to drop C from all my systems, but the alternatives simply aren't there.
The alternatives were there before UNIX took over the server room and workstation markets.

Just imagine how many millions the IT industry and PhD research have spent developing solutions that would improve C's safety, many of them largely ignored by most C developers.

> The problem with C is that in any bigger project something always slips through even the best programmers, reviewers, static analysis and unit tests.

That's also true of all the other languages.

Well, I can say the same about Python, Erlang, Lua, in addition to C and C++. I believe C is not worse than these languages, only that C requires different (sometimes very different) skills and discipline.
I'm absolutely sure a programmer of the same skill level will create fewer defects in Python, Erlang and Lua than in C. You really have to try to overwrite memory in those languages.

Of course you can shoot yourself in the foot with stuff like metatables in Lua, Python metaclasses and whatnot. Then again, you should see some of the C macro messes around...

Anyway, I don't like it when people defend C with the age-old argument that it requires a clever, disciplined programmer who never makes mistakes. Either such programmers don't exist or they're very rare.

> I'm absolutely sure a programmer of the same skill level will create fewer defects in Python, Erlang and Lua than in C.

Fewer defects, or just different (arguably less severe) defects? It's great that you're sure, but evidence would be even better.

Ok, that's a fair point. I don't have the evidence for that.

Scripting languages do have their pitfalls. Lua and Python can have type mismatches and even typos that cause misbehavior, things that usually aren't issues in C.

However, you do need significantly less code than in C.

Python, Erlang, Lua = Logic Errors

C and C++ = Logic Errors + Memory Corruption + UB

From this point of view,

Σ Logic Errors < Σ (Logic Errors + Memory Corruption + UB)

hmmm... In my experience, I have had far fewer logic errors in C++ than in Python or JS, because I try to encode as much of the domain logic as possible into the types so that I can piggyback on the compiler.
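A small illustration of the idea, written in C rather than C++ to stay with the language under discussion. The unit types and the `speed` function are hypothetical. C's type system is much weaker than C++'s, but distinct struct wrappers already let the compiler reject swapped arguments that plain `double` parameters would accept silently.

```c
#include <assert.h>

/* Hypothetical domain types: same representation, but not interchangeable. */
typedef struct { double value; } meters;
typedef struct { double value; } seconds;
typedef struct { double value; } meters_per_second;

/* speed(t, d) with the arguments swapped fails to compile,
 * whereas a (double, double) signature would accept it. */
meters_per_second speed(meters distance, seconds elapsed)
{
    assert(elapsed.value != 0.0);
    return (meters_per_second){ distance.value / elapsed.value };
}
```

Usage would look like `speed((meters){100.0}, (seconds){20.0})`. A dynamically typed language would only notice the swapped arguments at runtime, if at all.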
And how many memory corruption and UB errors did your Python and JS code have?
None, but I hardly ever encounter them in C++ either, since I always develop in debug mode with sanitizers and debug std containers, so they blow up immediately. In C that's another story...
I agree, it's very click-baity. C is actually great and is only really dangerous because it gives the programmer so much control.
I'm a C coder first and foremost and I strongly disagree with this mentality (even though I know it's extremely pervasive in our circles). "Footguns don't make bugs, coders do" is technically true, but if we could keep the footguns to a minimum and only get them out of the locker when truly necessary, instead of having them spread all over the place all the time, I'm sure it wouldn't hurt.

C is a very useful language, and one you basically have to know if you're interested in low-level software, but it's very, very far from flawless.

If you look at the high-profile software vulnerabilities of late (Heartbleed, goto fail, etc.), many can be traced to the lack of safety and/or bad ergonomics of the C language.

We need to grow up as an industry and accept that using a seatbelt doesn't mean that you're a bad driver. Shit happens.

> C is actually great and is only really dangerous because it gives the programmer so much control.

This doesn't actually refute the assertion that C is dangerous :)

Control and increased safety are not mutually exclusive. I'll take safe-by-default, unsafe-when-asked any day. It's not 1972 anymore.

"Programmers using C are considered dangerous"
C actually gives one rather limited control over modern hardware with its memory hierarchies and superscalar CPUs. Programming language research has also moved on a lot since the 70's, which is why we should be considering less dangerous languages (e.g. better type systems and less undefined behaviour). Languages like ATS and Rust also support explicit memory management, whilst being a whole lot safer.
C alone doesn't provide the control directly, but you as a programmer can absolutely leverage C to take control of the memory hierarchies by controlling your data access patterns. In other words: high locality of reference.

Good C compilers will take care of superscalar CPU friendliness most of the time. When they don't, you can always drop down to the assembler level, and it'll mesh well with C.
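As a sketch of what "controlling your data access patterns" means in practice, the two hypothetical functions below compute the same sum over a row-major matrix. The first walks memory sequentially. The second jumps `cols` elements between consecutive reads, which on large matrices typically causes far more cache misses. How big the difference is depends on the hardware and the matrix size, so measure before relying on it.

```c
#include <assert.h>
#include <stddef.h>

/* m points to rows * cols ints stored row by row (row-major layout). */

/* Cache-friendly: consecutive iterations touch adjacent addresses. */
long sum_row_major(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += m[r * cols + c];
    return total;
}

/* Same result, but each inner step strides over a whole row. */
long sum_col_major(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    long total = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += m[r * cols + c];
    return total;
}
```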

High-locality of reference can be achieved in any language that supports unboxed types, it doesn't require C (even a very high-level language like Haskell has support for this). But this is a long way from having complete control over how each level of the memory hierarchy is used.

Likewise most static languages defer to the compiler for CPU-specific performance optimisations and will permit foreign native calls into C or ASM where necessary. So I don't see how this is an argument in C's favour.

> High-locality of reference can be achieved in any language that supports unboxed types, it doesn't require C (even a very high-level language like Haskell supports this).

You often also need correct alignment. Cache-line or page. Your unboxed access across two pages can cause two TLB misses, L1 misses etc. Not to mention two page faults.

Sometimes you need to ensure two (or more) buffers are NOT aligned in a particular way to avoid interfering with CPU caching mechanisms.
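For what it's worth, here is roughly what that looks like in C11. The 64-byte line size is an assumption (common on x86-64, but not universal), and the helper names are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed line size; check your target CPU */

static_assert((CACHE_LINE & (CACHE_LINE - 1)) == 0,
              "line size must be a power of two");

/* C11 alignment specifier: the buffer starts on a cache-line boundary. */
static _Alignas(CACHE_LINE) unsigned char line_buf[2 * CACHE_LINE];

unsigned char *line_buffer(void)
{
    return line_buf;
}

/* Returns 1 if p sits on a cache-line boundary, 0 otherwise. */
int is_line_aligned(const void *p)
{
    return ((uintptr_t)p % CACHE_LINE) == 0;
}

/* Returns 1 if an access of size bytes at addr touches two or more lines. */
int straddles_line(uintptr_t addr, size_t size)
{
    if (size == 0)
        return 0;
    return (addr / CACHE_LINE) != ((addr + size - 1) / CACHE_LINE);
}
```

The same arithmetic with the page size instead of `CACHE_LINE` tells you whether an access crosses a page boundary.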

The only support C is giving you for this is that it has sized unboxed types (and raw pointer access). Even then, you'd have to trust the compiler and take measurements to be sure.
Even in the 70's there was NEWP, PL/I, PL/S, PL/8, Concurrent Pascal, Mesa, BLISS, Modula-2, ....

C beats them all in implicit conversions and opportunities for memory corruption.

Their major sin was to be tied to commercial OSes, instead of one with source code available for a symbolic price to universities.

Are you suggesting that other languages provide more control over modern hardware?
Yes. Currently, access to modern hardware features is either via cumbersome APIs (e.g. NUMA, AVX intrinsics), handled by the OS (e.g. paging, scheduling), or handled by the hardware itself (the cache memory hierarchy). The problem will get worse as modern CPUs and machines continue to diverge from those originally targeted by C in the 1970s.