by apaprocki 4843 days ago
> Fourth, that the strings were terminated by the character null usually sent me into outrage and orbit around Pluto.

Everything is about tradeoffs. Fortran uses space-padded strings with no null terminator. On the positive side, this forces everyone to pass the length they mean explicitly, instead of doing extra work at runtime to find the end by scanning for the null sentinel. Passing explicit lengths is good practice in C anyway, because you usually avoid scanning the contents multiple times (e.g. repeated calls to strlen at different levels of the stack). In theory all of this should make things better in the Fortran case, but the bugs that persist are even harder to find: poorly written code miscalculates the length, ignores it, etc., and stomps over adjacent memory. This probably won't crash, and since other code also uses an explicit length when accessing the buffer, you usually won't notice the problem at its source. Contrast that with C, where you're more likely to see an issue as soon as the string is used or passed to something else.
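To make the space-padded convention concrete, here is a rough sketch in C of what a Fortran-style fixed-width field looks like when the length is always passed explicitly. The names field_assign and field_len_trim are made up for illustration (the second is only loosely modeled on Fortran's LEN_TRIM), so treat this as an assumption-laden sketch rather than how any compiler actually implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into a fixed-width field, Fortran-style.
   Truncates if src is too long, pads with spaces if it is too short,
   and never writes a null terminator. */
void field_assign(char *dst, size_t dst_len, const char *src, size_t src_len)
{
    assert(dst != NULL && (src != NULL || src_len == 0));
    size_t n = src_len < dst_len ? src_len : dst_len;
    if (n > 0)
        memcpy(dst, src, n);
    memset(dst + n, ' ', dst_len - n);  /* pad the remainder with spaces */
}

/* Hypothetical helper: length of the field ignoring trailing padding. */
size_t field_len_trim(const char *buf, size_t len)
{
    while (len > 0 && buf[len - 1] == ' ')
        len--;
    return len;
}
```

Note that nothing here can check that dst_len matches the real size of the buffer. If a caller passes 16 for an 8-byte field, the memset quietly writes past the end, which is exactly the kind of silent stomping described above.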

tl;dr Poor programming is poor programming in any language.

1 comments

Yup.

With PL/I the maximum length of the string is set when the string is allocated, usually dynamically during execution. The length can be given as a constant in the source code or set from computations during execution. There is also a current length <= the maximum length. When passing that string to a subroutine, the subroutine has to work a little to discover the maximum string length, but, by in effect 'hiding' both the current and maximum length from the programmer of the subroutine, the frequency of some of the errors you mentioned should be reduced.

In Visual Basic .NET, every string has the same maximum length, as I recall 2 GB. Then having the strings be 'immutable' was a cute approach, slightly frustrating at times but otherwise quite nice and a good way to avoid the problems you mentioned.

But, of course, the way I actually used strings in C was close to the way they were supported in Fortran.

And, of course, likely 100,000+ C programmers wrote their own collection of string handling routines that use a struct to keep all the important data on the string, say, allocated or not, pointer to the allocated storage, maximum allocated length, current length, etc. (multi-byte character set anyone?) and then pass just a pointer to the struct instead of a pointer to the storage of the string; this, again, should reduce the frequency of some of the errors you mentioned.
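For anyone who hasn't written one of these, a minimal sketch of such a struct might look like the following. All the names (str_t, str_init, str_append, str_free) are hypothetical and not taken from any particular library, and lengths are counted in bytes, so the multi-byte question is left open.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string descriptor: everything travels together. */
typedef struct {
    char  *data;   /* pointer to the storage, not null-terminated */
    size_t len;    /* current length in bytes */
    size_t cap;    /* maximum (allocated) length in bytes */
    int    owned;  /* nonzero if data was allocated by str_init */
} str_t;

/* Allocate storage for up to cap bytes; returns 0 on success, -1 on failure. */
int str_init(str_t *s, size_t cap)
{
    assert(s != NULL);
    s->data = cap ? malloc(cap) : NULL;
    s->len = 0;
    s->cap = s->data ? cap : 0;
    s->owned = s->data != NULL;
    return (cap == 0 || s->data != NULL) ? 0 : -1;
}

/* Append n bytes; returns -1 and leaves s unchanged rather than overrun. */
int str_append(str_t *s, const char *src, size_t n)
{
    assert(s != NULL);
    if (n > s->cap - s->len)
        return -1;
    if (n > 0)
        memcpy(s->data + s->len, src, n);
    s->len += n;
    return 0;
}

/* Release the storage if we own it and reset the descriptor. */
void str_free(str_t *s)
{
    assert(s != NULL);
    if (s->owned)
        free(s->data);
    s->data = NULL;
    s->len = s->cap = 0;
    s->owned = 0;
}
```

The point of the design is that callers only ever hand around a str_t *, so the bounds check lives in one place (str_append) instead of being redone, or forgotten, at every call site.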