by amluto 1426 days ago
Beyond a lack of memory safety, C has another issue that makes me dislike it for this kind of application: C has a very minimal set of built in data structures. Combined with a lack of generics, this means that using, say, a dictionary means that quite a bit of the implementation gets hard coded into every site that uses the dictionary. This is almost invariably done with lots of pointers (since C has no better-constrained reference type), and the result can be bug-prone and difficult to refactor.

For all of C++’s faults, at least it’s possible to use a map (or unordered_set or whatever) and mostly avoid encoding the fact that it’s anything other than an associative container of some sort at the call sites. This is especially true in C++11 or newer with auto.

3 comments

[Wuffs](https://github.com/google/wuffs) is made for stuff like this, and it has a library available as transpiled C code.

> this means that using, say, a dictionary means that quite a bit of the implementation gets hard coded into every site that uses the dictionary

I don't understand this part of your comment. There's nothing preventing you from designing a nice well-encapsulated map/dictionary data structure in C and I'm sure there are many many libraries that do just that.

I do agree though that having such basic data structures in the standard library, as modern C++ does, is usually preferable.

Lack of generics will do that, unless you consider that blindly casting `void *` all over the place counts as "well-encapsulated". Even with macro soup, designing a good type-agnostic dictionary implementation for C is rather challenging. Linked lists are okay if you use something like the kernel's list.h, but even then it's macro-heavy and has its pitfalls.
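
To make the list.h point concrete, here is a minimal sketch of the intrusive-list idea. It is a simplified imitation written for this comment, not the kernel's actual header: `list_node`, `list_add_tail`, `struct job`, and `demo_list` are names I made up, and only `container_of` mirrors a macro name the kernel really uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified imitation of an intrusive list;
   not the real list.h API. */
struct list_node {
    struct list_node *prev, *next;
};

/* Recover the enclosing struct from a pointer to its embedded node.
   Nothing checks that `type` and `member` are the right ones. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_node *head) {
    head->prev = head;
    head->next = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head) {
    assert(node != NULL && head != NULL);
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

struct job {
    int id;
    struct list_node link; /* the list node lives inside the element */
};

/* Sum the ids of every job on the list. */
static int sum_job_ids(struct list_node *head) {
    int sum = 0;
    for (struct list_node *n = head->next; n != head; n = n->next) {
        struct job *j = container_of(n, struct job, link);
        sum += j->id;
    }
    return sum;
}

static int demo_list(void) {
    struct list_node head;
    struct job a = { .id = 1 }, b = { .id = 2 }, c = { .id = 39 };
    list_init(&head);
    list_add_tail(&a.link, &head);
    list_add_tail(&b.link, &head);
    list_add_tail(&c.link, &head);
    return sum_job_ids(&head);
}
```

The pitfall is in `container_of`: if a node that belongs to some other struct ends up on this list, the cast still compiles and you read garbage at runtime.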

In my work as an embedded developer I still use C a lot, and it's probably the programming language I know best and have the most experience with, but it would never cross my mind to write a PDF interpreter in it unless I had a tremendous reason to do so. There are so many better choices these days.

Type safety and encapsulation are distinct issues. The Linux kernel uses many well-encapsulated interfaces but it's written in C and the typing reflects that limitation.

Personally I haven't used straight C in years and would never choose it over C++ unless platform constraints required it, but a vast amount of very complex software has been and continues to be written in C, including all the widely used OS kernels, so I don't find it very surprising that a new feature in a very old piece of software would be written in it.

Except when you need to build from source: you'll need yet another whole compiler toolchain that may or may not behave well in a specific environment. E.g., do you know how well Rust (or another "modern" language) works on late-nineties MIPS systems? The C compiler is the lowest common denominator.

> There's nothing preventing you from designing a nice well-encapsulated map/dictionary data structure in C

When you write a set function for your map data structure, what type do you make the key parameter?

Code from yalsat (a stochastic SAT solver) [1] taught me something two years ago: I can declare an array of some element type and make access to its elements statically typed. The same goes for maps, sets, and other containers.

[1] https://github.com/msoos/yalsat/blob/main/yals.c#L49
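
I have not reproduced the yalsat code here. What follows is a generic sketch of the general trick as I understand it: a macro declares a struct whose element pointer has the real element type, so indexing stays statically typed. The names `STACK`, `PUSH`, `TOP`, `RELEASE`, and `demo_stack` are my own and hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical macros for illustration; not copied from yals.c. */
#define STACK(T) struct { T *data; size_t len, cap; }

/* Append v, doubling the capacity when full. Aborts on allocation
   failure to keep the sketch short; real code should report it. */
#define PUSH(s, v) do {                                           \
    if ((s).len == (s).cap) {                                     \
        size_t ncap = (s).cap ? 2 * (s).cap : 4;                  \
        void *p = realloc((s).data, ncap * sizeof *(s).data);     \
        if (p == NULL) abort();                                   \
        (s).data = p;                                             \
        (s).cap = ncap;                                           \
    }                                                             \
    (s).data[(s).len++] = (v);                                    \
} while (0)

/* Last element; the expression has type T, not void *. */
#define TOP(s) ((s).data[(s).len - 1])

#define RELEASE(s) do {                                           \
    free((s).data);                                               \
    (s).data = NULL;                                              \
    (s).len = (s).cap = 0;                                        \
} while (0)

static double demo_stack(void) {
    STACK(int) ints = { NULL, 0, 0 };
    STACK(double) reals = { NULL, 0, 0 };
    int sum = 0;
    double result;

    for (int i = 1; i <= 10; i++)
        PUSH(ints, i);
    PUSH(reals, 0.5);
    assert(ints.len == 10);

    /* ints.data[i] is an int and TOP(reals) is a double: no casts. */
    for (size_t i = 0; i < ints.len; i++)
        sum += ints.data[i];
    result = sum + TOP(reals);

    RELEASE(ints);
    RELEASE(reals);
    return result;
}
```

One limitation of this sketch: every `STACK(int)` expands to a distinct anonymous struct type, so you would want a `typedef` per element type before passing these to functions.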

This is a pointer-based language, so there are lots of ways to solve that, but you know that already; this is a setup question. Of course it's not useful to reinvent critical, secure functions over and over. Yet what if I am not writing critical, secure functions anyway?

I would choose a key type that is natural to the environment and problem. Unsigned integers are useful. Which unsigned integer size? There are only a couple of practical answers to that: unless there is some massive dataset, use a 32-bit unsigned integer, as so much software does right now.

`size_t key_size, void *key`

And then eschew type safety.
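
Roughly what that signature looks like in practice, as a sketch only: `struct map`, `map_set`, `map_get`, and `demo_map` are hypothetical names, the table is an unsorted linked list to keep it short, and values are borrowed pointers that the caller has to keep alive.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type-erased map; names and layout are illustrative only. */
struct map_entry {
    void *key;               /* private copy of the key bytes */
    size_t key_size;
    void *value;             /* borrowed: the caller owns the storage */
    struct map_entry *next;
};

struct map {
    struct map_entry *head;  /* unsorted list, O(n) lookup, for brevity */
};

static struct map_entry *map_find(const struct map *m, size_t key_size,
                                  const void *key) {
    for (struct map_entry *e = m->head; e != NULL; e = e->next)
        if (e->key_size == key_size && memcmp(e->key, key, key_size) == 0)
            return e;
    return NULL;
}

/* Returns 0 on success, -1 on allocation failure. */
static int map_set(struct map *m, size_t key_size, const void *key,
                   void *value) {
    struct map_entry *e;
    assert(m != NULL && key != NULL);
    e = map_find(m, key_size, key);
    if (e != NULL) { e->value = value; return 0; }
    e = malloc(sizeof *e);
    if (e == NULL) return -1;
    e->key = malloc(key_size ? key_size : 1);
    if (e->key == NULL) { free(e); return -1; }
    memcpy(e->key, key, key_size);
    e->key_size = key_size;
    e->value = value;
    e->next = m->head;
    m->head = e;
    return 0;
}

static void *map_get(const struct map *m, size_t key_size, const void *key) {
    struct map_entry *e = map_find(m, key_size, key);
    return e != NULL ? e->value : NULL;
}

static void map_free(struct map *m) {
    struct map_entry *e = m->head;
    while (e != NULL) {
        struct map_entry *next = e->next;
        free(e->key);
        free(e);
        e = next;
    }
    m->head = NULL;
}

/* Returns 1 if the 32-bit lookup hits and the 64-bit lookup misses. */
static int demo_map(void) {
    struct map m = { NULL };
    static int answer = 42;
    uint32_t k32 = 7;
    uint64_t k64 = 7;
    int found_same, found_other;

    if (map_set(&m, sizeof k32, &k32, &answer) != 0) return -1;
    found_same = map_get(&m, sizeof k32, &k32) != NULL;
    /* Compiles without complaint, but the key width differs: no match. */
    found_other = map_get(&m, sizeof k64, &k64) != NULL;
    map_free(&m);
    return found_same && !found_other;
}
```

The compiler accepts both lookups. The mismatch between a 4-byte and an 8-byte key only shows up as a silent miss at runtime, which is what eschewing type safety costs you.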
> nice well-encapsulated

...

> void *

Type safety and encapsulation aren't the same thing. Encapsulation is about hiding implementation details from the user of an API, which is what the comment I originally replied to was claiming you couldn't do in C.
The void * is (should have been!) an implementation detail, and you're leaking it in the interface - that's not encapsulation.

For example, if I want to store an `__int128` on a 64-bit machine, I'll have to deal with stuff like memory allocation and lifetime myself, when the data structure should do that.

Code reuse is achievable by (mis)using the preprocessor. It is possible to build a somewhat usable API, even for intrusive data structures (e.g., the Linux kernel and klib [1]).

I do agree that generics are required for modern programming, but for some, the cost of complexity of modern languages (compared to C) and the importance of compatibility seem to outweigh the benefits.

[1]: http://attractivechaos.github.io/klib
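
As a rough illustration of the preprocessor approach, here is a sketch of a macro that stamps out a typed, fixed-capacity map with linear probing. It is only in the spirit of such libraries: klib's real khash API is different, and `DEFINE_INT_MAP`, `u32_to_double`, and `demo_typed_map` are hypothetical names. It assumes integer keys and has no deletion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generator macro; not klib's API.
   Generates struct NAME plus NAME_set and NAME_get for integer keys.
   NAME_slot returns the slot holding `key`, the first free slot on its
   probe path, or CAP when the table is full. */
#define DEFINE_INT_MAP(name, key_t, val_t, CAP)                             \
    struct name {                                                           \
        key_t keys[CAP];                                                    \
        val_t vals[CAP];                                                    \
        unsigned char used[CAP];                                            \
    };                                                                      \
    static size_t name##_slot(const struct name *m, key_t key) {            \
        size_t start =                                                      \
            (size_t)((uint64_t)key * 0x9E3779B97F4A7C15ull % (CAP));        \
        for (size_t i = 0; i < (CAP); i++) {                                \
            size_t s = (start + i) % (CAP);                                 \
            if (!m->used[s] || m->keys[s] == key) return s;                 \
        }                                                                   \
        return (CAP);                                                       \
    }                                                                       \
    static int name##_set(struct name *m, key_t key, val_t val) {           \
        size_t s = name##_slot(m, key);                                     \
        if (s == (CAP)) return -1;                                          \
        m->used[s] = 1;                                                     \
        m->keys[s] = key;                                                   \
        m->vals[s] = val;                                                   \
        return 0;                                                           \
    }                                                                       \
    static val_t *name##_get(struct name *m, key_t key) {                   \
        size_t s = name##_slot(m, key);                                     \
        return (s != (CAP) && m->used[s]) ? &m->vals[s] : NULL;             \
    }

/* One instantiation: uint32_t keys, double values, 64 slots. */
DEFINE_INT_MAP(u32_to_double, uint32_t, double, 64)

static double demo_typed_map(void) {
    struct u32_to_double prices = { { 0 } };
    double *hit, *miss;
    int rc;

    rc = u32_to_double_set(&prices, 1001u, 2.5);
    rc |= u32_to_double_set(&prices, 1001u, 4.0); /* overwrites 2.5 */
    assert(rc == 0);
    (void)rc;

    hit = u32_to_double_get(&prices, 1001u);
    miss = u32_to_double_get(&prices, 7u);
    return (hit != NULL && miss == NULL) ? *hit : -1.0;
}
```

Call sites get real types (`u32_to_double_get` returns a `double *`), but the implementation lives inside a macro, so compiler errors and debugger output are about as pleasant as you would expect.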