by graycat 4890 days ago

For "dark corners of C", when I was writing C code I had several serious concerns. Below I list eight of them in roughly descending order of 'seriousness':

First, what are malloc() and free() doing? That is, what are the details, all the details and exactly how they work?

It was easy enough to read K&R, see how malloc() and free() were supposed to be used, and use them. But even if they worked perfectly, I was unsure of the correctness of my code, especially in challenging situations. I expected problems with 'memory management' to be very difficult to debug and wanted a lot of help with memory management. I would have written my own 'help' for memory management if I had known what C's memory management was actually doing.

'Help' for memory management? Sure: Put in a lot of checking and be able to get out a report on what was allocated, when, by what part of the code, maybe keep reference counters, etc. to provide some checks to detect problems and some hints to help in debugging.
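A minimal sketch of that kind of 'help', written as a wrapper around malloc() and free() rather than as a replacement for them. Every name here (TrackedBlock, tracked_malloc, tracked_free, tracked_report, TRACKED_MALLOC) is invented for illustration, and the bookkeeping is a plain linked list, so this is a debugging aid under those assumptions and not how any particular C library works:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record: one per live allocation. */
typedef struct TrackedBlock {
    void *ptr;                  /* address returned by malloc() */
    size_t size;                /* bytes requested */
    const char *file;           /* source file that asked for the memory */
    int line;                   /* source line that asked for the memory */
    struct TrackedBlock *next;
} TrackedBlock;

static TrackedBlock *tracked_head = NULL;

/* Allocate and remember who asked. Returns NULL on failure, like malloc(). */
void *tracked_malloc(size_t size, const char *file, int line)
{
    assert(file != NULL);
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    TrackedBlock *b = malloc(sizeof *b);
    if (b == NULL) {
        free(p);
        return NULL;
    }
    b->ptr = p;
    b->size = size;
    b->file = file;
    b->line = line;
    b->next = tracked_head;
    tracked_head = b;
    return p;
}

/* Free only pointers we handed out. Returns 1 on success, 0 otherwise. */
int tracked_free(void *ptr)
{
    for (TrackedBlock **pp = &tracked_head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->ptr == ptr) {
            TrackedBlock *dead = *pp;
            *pp = dead->next;
            free(dead->ptr);
            free(dead);
            return 1;
        }
    }
    fprintf(stderr, "tracked_free: %p is not a live allocation\n", ptr);
    return 0;
}

/* Print every live allocation and return how many there are. */
size_t tracked_report(FILE *out)
{
    size_t count = 0;
    for (const TrackedBlock *b = tracked_head; b != NULL; b = b->next) {
        fprintf(out, "%zu bytes at %p from %s:%d\n",
                b->size, b->ptr, b->file, b->line);
        count++;
    }
    return count;
}

/* Convenience macro so the call site is recorded automatically. */
#define TRACKED_MALLOC(n) tracked_malloc((n), __FILE__, __LINE__)
```

Calling tracked_report(stderr) just before exit lists whatever was never freed, along with the file and line that allocated it. Reference counts could be added as one more field in the record.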

That I didn't know the details was a bummer.

It was irritating that K&R, etc. kept saying that malloc() allocated space in the 'heap' without saying just what they meant by a 'heap', which I doubt was a 'heap' as in heap sort.

Second, the 'stack' and 'stack overflow' were always looming as a threat of disaster, difficult to see coming, and to be protected against only by mud wrestling with obscure commands to the linkage editor or whatever. So, I had no way to estimate stack size when writing code or to track it during execution.

Third, doing data conversions with a 'cast' commonly sent me into outrage orbiting Jupiter.

Why? Data conversion is very important, but a 'cast' never meant anything. K&R just kept saying 'cast' as if they were saying something meaningful, but they never were. In the end 'cast' was just telling the type checking of the compiler, "Yes, I know, I'm asking for a type conversion, so get me a special dispensation from the type checking police."

What was missing were the details, for each case, on just how the conversion would be done. In strong contrast, when I was working with PL/I, the documentation went to great lengths to be clear on the details of conversion for each case of conversion. I knew when I was doing a conversion and didn't need the 'discipline' of type checking in the compiler to make me aware of where I was doing a conversion.

Why did I want to know the details of how the conversions were done? So that I could 'desk check' my code and be more sure that some 'boundary case' in the middle of the night two years in the future wouldn't end up with a divide by zero, a square root of a negative number, or some such.

So, too often I wrote some test code to be clear on just what some of the conversions actually did.
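A sketch of what such test code might look like, for a few common casts. The helper names are hypothetical and the cases are only a sample. The rules in the comments are the ones the C standard gives for these conversions, and the float-to-int case is undefined when the value does not fit in an int, which is exactly the kind of boundary case worth knowing about:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers: each wraps one cast so its rule can be checked alone. */

/* Floating to integer: the fractional part is discarded (truncation toward
   zero). Undefined behavior if the truncated value does not fit in an int. */
int double_to_int(double x) { return (int)x; }

/* Integer to a narrower unsigned type: reduced modulo (UCHAR_MAX + 1). */
unsigned char int_to_uchar(int x) { return (unsigned char)x; }

/* Signed to unsigned of the same width: reduced modulo (UINT_MAX + 1). */
unsigned int int_to_uint(int x) { return (unsigned int)x; }

/* Integer to floating: small values such as 7 are represented exactly. */
double int_to_double(int x) { return (double)x; }

/* Desk-check the rules above with concrete values. */
void check_conversions(void)
{
    assert(double_to_int(3.9) == 3);
    assert(double_to_int(-3.9) == -3);      /* toward zero, not floor */
    assert(int_to_uchar(300) == 300 % (UCHAR_MAX + 1));
    assert(int_to_uint(-1) == UINT_MAX);
    assert(7 / 2 == 3);                     /* integer division */
    assert(int_to_double(7) / 2 == 3.5);    /* cast first, then divide */
}
```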

Fourth, that the strings were terminated by the character null usually sent me into outrage and orbit around Pluto. Actually I saw that null terminated strings were so hopeless as a good tool that I made sure I never counted on the null character being there (except maybe when reading the command line). So, I ended up manipulating strings without counting on the character null.

Why? Because commonly the data I was manipulating as strings could contain any bytes at all, e.g., the data could be from graphics, audio, some of the contents of main memory, machine language instructions, output of data logging, say, sonar data recorded on a submarine at sea, etc. And, no matter what the data was, no way did I want the string manipulation software to get a tummy ache just from finding a null.
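One way to do this, sketched under the assumption that the length is simply carried alongside the data. ByteSpan, span_equal, and span_find are hypothetical names, and only memcmp() from the standard library is used, since it compares a given number of bytes and does not stop at a null:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted byte string: the length travels with the data,
   so a zero byte is just another byte. */
typedef struct {
    const unsigned char *data;
    size_t len;
} ByteSpan;

/* Nonzero if both spans hold the same bytes. */
int span_equal(ByteSpan a, ByteSpan b)
{
    return a.len == b.len &&
           (a.len == 0 || memcmp(a.data, b.data, a.len) == 0);
}

/* Offset of the first occurrence of needle in hay, or (size_t)-1 if absent.
   An empty needle is found at offset 0. */
size_t span_find(ByteSpan hay, ByteSpan needle)
{
    assert(hay.data != NULL || hay.len == 0);
    if (needle.len > hay.len)
        return (size_t)-1;
    for (size_t i = 0; i + needle.len <= hay.len; i++) {
        if (needle.len == 0 ||
            memcmp(hay.data + i, needle.data, needle.len) == 0)
            return i;
    }
    return (size_t)-1;
}
```

With the bytes 'a', 0, 'b', 0, 0, 'c', strlen() would report a length of 1, while span_find() can still locate the pattern 0, 0, 'c' at offset 3.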

Fifth, knowing so little about the details of memory management, the stack, and exceptional condition handling, I was very reluctant to consider trying to make threading work.

Sixth, arrays were a constant frustration. The worst part was that I could write a subroutine to, say, invert a 10 x 10 matrix but then couldn't use it to invert a 20 x 20 matrix. Why? Because inside the subroutine, the 'extents' of the dimensions of the matrix had to be given as just integer constants and, thus, could not be discovered by the subroutine after it was called. So, basically in the subroutine I had to do my own array indexing arithmetic starting with data on the size of the matrix passed via the argument list. Writing my own code for the array indexing was likely significantly slower during execution than in, say, Fortran or PL/I, where the compiler writer knows when they are doing array indexing and can take advantage of that fact.

So, yes, no doubt like tens of thousands of other C programmers, I wrote a collection of matrix manipulation routines, and for each matrix used a C struct to carry the data describing the matrix that PL/I carried in what the IBM PL/I execution logic manual called a 'dope vector'. The difference was, both PL/I and C programmers pass dope vectors, but the C programmers have to work out the dope vector logic for themselves. With a well written compiler, the approach of PL/I or Fortran should be faster.
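A minimal sketch of such a hand-rolled dope vector, assuming row-major storage of doubles. The Matrix struct and the mat_* functions are hypothetical names for illustration, not taken from any particular library. The point is that the extents live in the struct, so one routine serves a matrix of any size:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dope vector: describes a matrix stored in a flat array. */
typedef struct {
    double *data;    /* first element */
    size_t rows;     /* extent of the first dimension */
    size_t cols;     /* extent of the second dimension */
    size_t stride;   /* elements between the starts of consecutive rows */
} Matrix;

/* The indexing arithmetic the programmer has to write by hand. */
double *mat_at(Matrix m, size_t i, size_t j)
{
    assert(i < m.rows && j < m.cols);
    return &m.data[i * m.stride + j];
}

/* Transpose a square matrix in place, whatever its size. */
void mat_transpose_square(Matrix m)
{
    assert(m.rows == m.cols);
    for (size_t i = 0; i < m.rows; i++) {
        for (size_t j = i + 1; j < m.cols; j++) {
            double tmp = *mat_at(m, i, j);
            *mat_at(m, i, j) = *mat_at(m, j, i);
            *mat_at(m, j, i) = tmp;
        }
    }
}

/* Sum of the diagonal of a square matrix. */
double mat_trace(Matrix m)
{
    assert(m.rows == m.cols);
    double sum = 0.0;
    for (size_t i = 0; i < m.rows; i++)
        sum += *mat_at(m, i, i);
    return sum;
}
```

The same mat_transpose_square() works on a 2 x 2 and a 3 x 3 matrix because the extents arrive at run time in the struct. Keeping stride separate from cols also lets a Matrix describe a sub-block of a larger array.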

It did occur to me that maybe other similar uses of the C struct 'data type' were the inspiration for Stroustrup's C++. Moreover, originally C++ was just a preprocessor to C, and at that time and place, Bell Labs, with Ratfor, preprocessors were popular. Actually writing a compiler would have permitted a nicer language.

Seventh, PL/I was in really good shape some years before C was started and had subsets that were much better than C and not much more difficult to compile, etc. E.g., PL/I arrays and structures are really nice, much better than C, and mostly are surprisingly easy to implement and efficient at execution. Indeed, PL/I structures are so nice that they are in practice nearly as powerful as objects and often easier and more intuitive to use. What PL/I did with scope of names is also super nice to have and would have helped C a lot.

Eighth, the syntax of C, especially for pointers, was 'idiosyncratic' and obscure. The semantics in PL/I were more powerful, but the syntax was much easier to read and write. There is no good excuse for the obscure parts of C syntax.

For a software 'platform' for my startup, I selected Windows instead of some flavor of Unix. There I wanted to build on the 'common language runtime' (CLR) and the .NET Framework. So, for languages, I could select from C#, Visual Basic .NET, F#, etc.

I selected Visual Basic .NET and generally have been pleased with it. The syntax and memory management are very nice; .NET is enormous; some of what is there, e.g., for 'reflection', class instance serialization, and some of what ASP.NET does with Visual Basic .NET, is amazing. In places Visual Basic borrows too much from C and would have done better borrowing from PL/I.

6 comments

to3m 4890 days ago

I think C might make more sense if you are more familiar with assembly language. I learned C because real-mode x86 looked so fantastically ugly (looking back, a rare instance of youthful good taste). 0-terminated strings and stack allocation were quite familiar to me (though I never used stack allocation myself because it made the disassembly hard to read) and the overall model made perfect sense.