Hacker News new | ask | show | jobs
by tobiasu 4843 days ago
Last time this came up on HN, it was thought to be a clang feature.

Edit: While we're talking about dark corners, please stop casting functions that return void * . If your code lacks the declaration of the function, the compiler will assume pre ANSI-C semantics and generate code returning an int.

On machines where pointers do not fit ints (basically all 64bit machines), you just silently (due to the cast there is no warning) truncated a pointer. Worse, it may work depending on the malloc implementation and how much memory you allocate.

We have to fix these kinds of bugs on OpenBSD a lot, please help by typing less and let the compiler warn you about silly mistakes :-)

And yes, C++ fucked this up for C. I'll leave it to Linus to say something nice about that..

7 comments

> If your code lacks the declaration of the function, the compiler will assume pre ANSI-C semantics and generate code returning an int.

Better, please help by compiling your C code with -Wimplicit-function-declaration (included in -Wall), and fixing all the problems it reports. Then you won't have to worry about this problem, or a bunch of other problems.

All-too-common response: "Oh, but they're just warnings mumble compiles and runs 'cleanly' mumble..."

    -Wall -Wextra -Werror
It's a C99 feature, and clang's the only compiler I know of that produces the diagnostic (and only in later versions)

btw; I wrote this talk.

I didn't know that. To answer my previous question, clang doesn't fire a warning when passing pointers of any size to the function foo.

And by the way, nice talk, it's great learning these dark secrets of C.

The compiler can't know at compile time with a naked pointer like it can with an array. [static 1] is handy to say it must not explicitly be NULL, as if it were optional, however.
Yes, but I expected some kind of "You're passing a pointer as an array of size n. I can't check the size, but you should make sure you've checked it".
I can't imagine that would be anything but noise. 99% of my function calls have pointers passed through, not arrays.
But a pointer is not the same thing as an array, a pointer does not carry the size of the allocated space which an array does in the same scope.
Good, because I was missing a slide:

= Bitfields =

Not even once.

;)

Everybody knows about bitfields :)
Bitfields have some crazy fun dark corners, though. For instance, which values can this bitfield hold:

    int b:1;
Signed two's-complement n-bit integers can hold values from 2^(n-1)-1 to -2^(n-1), so in theory it can hold 0 and -1, but many compilers get that one wrong. Always declare one-bit bitfields as unsigned. (The Sparse static analyzer will warn you about that one.)
In my university they always told me to cast the return of void * functions, I thought it was just to avoid the warning saying "automatic cast of void* to int*" or similar.

Thanks, I will take that into account.

It is true in C++, not in C where void* needs no casting.

People tended to confuse C++ with C more in the past, as they had not diverged as much as they have now.

An annoying thing C++. Anyone know the history/rationale of this change?
It is supposed to provide you more checking by disabling automatic cast from void* to any other pointer. This makes sense in C++ since casting a pointer to a class can trigger some address adjustement if the target class of this instance pointed to is multiple derived (and maybe in other cases?). There is no way such adjustment can happen if the source type is void*, because then you don't know what the source type really is.
As far as I know, this is exactly the rationale, actually. Say you have:

  struct a { int a;}
  struct b { int b;}
  struct c : a, b { };

  c myc;
  c* p1 = &myc;
  b* pb1 = p1; // correctly points to myc's base b, which is offset

  void* p2 = &myc;
  b* pb2 = p2; // Would not point at instance of b, compile error
If I need void*-casting code to compile on both C and C++ compilers, I use a macro like this:

  #ifdef __cplusplus
  #define STATIC_CAST(T, EXPR) static_cast<T>(EXPR)
  #else
  #define STATIC_CAST(T, EXPR) (EXPR)
  #endif
This leaves the conversion to be implicit in C, and uses the stricter static_cast in C++ to catch certain types of likely-unsafe conversions, such as the aforementioned cast from int to pointer.
> While we're talking about dark corners, please stop casting functions that return void *

The problem is, if I want my C code to compile with MSVC, it has to compile as C++ - and even if I abhor Windows for development myself, a lot of developers are using MSVC.

I just wish Microsoft would update their C compiler, at least to C90. But then I suppose the standard has only been around for 23 years, and nobody really uses C anyway.

> The problem is, if I want my C code to compile with MSVC, it has to compile as C++

As someone who has compiled a lot of .c files that do not cast void pointers with cl.exe, I'd say no, this is not true... Maybe what you're trying to say is that your code relies on C99 features that are also present in C++? IMO if that's what's holding you back it's much easier to just write C89, maybe with the occasional ifdef, than to suddenly write some crummy C/C++ hybrid. That or just use mingw. (Unless you're doing SEH. I'm not aware of a good way to do SEH with mingw.)

You're suggesting I write C89 in 2013 just to support MSVC? C89 in which you can't even mix data and code?

I think casting void pointers is very much the lesser of two evils. And of course I can use MinGW, but that doesn't mean the majority of Windows developers don't insist on using MSVC.

C++ is not a superset of C[0]. If you want to use MSVC and newer features, then you're not writing C anymore; you're just writing an awkward C++ program.

[0] http://en.wikipedia.org/wiki/Compatibility_of_C_and_C%2B%2B

Nope, I'm definitely writing C99, but it just so happens that MSVC is happy enough pretending it's C++.

Definitely writing C99 in that I compile the exact same source files with -std=c99 -Wall -pedantic when using a real C compiler.

> C89 in which you can't even mix data and code?

I don't think this is as big a deal as you're making it out to be. Maybe I'm just used to it, but this is one area where I consider the C89 way to be better on stylistic grounds. (I wrote about this on a stack exchange site some time ago: http://programmers.stackexchange.com/questions/75039/why-dec...)

Realistically I don't think C99 is all that revolutionary, and the fact is there are plenty of people writing C89 in 2013. Let's look at some things C99 adds over C89:

* Mixed declarations and code

IMO not a huge deal, for reasons I give above.

* VLAs

A nice feature, but people have been using malloc for similar purposes for ages, so it's hard to say it's any better than "nice to have". I'd also argue that something that leads me to potentially consume arbitrary amounts of stack space based on runtime decisions is probably not universally good, and I'm pretty cautious about using VLAs even when given the choice.

* Last field of a struct can be variable sized array, eg. struct foo { int num_data; int data[]; };

I like this pattern and have used it a lot. Including with VC++ by having a zero-sized array at the end of the struct. This is once place where I use an ifdef to bridge the gaps because recently I noticed LLVM started doing weird things for this if you don't do it the "legit" C99 way.

* Cosmetic issues like // comments

VC++ already supports this in C89 files as a non-portable extension.

* Portable typedefs like uint32_t, etc.

These are also pretty handy. But if you're building with VC++ you're probably using the non-portable Microsoft ones anyway. (I think VC++ 2010 added the C99 versions too.)

* bool type

Again, every platform and even reasonably sized library works around this with a typedef and some macros.

* Slightly different behaviors for some libc functions

I notice this most for snprintf, where the pre-C99 return value [followed by Microsoft] means something different [and better] in C99. Again there are non-portable solutions for this that you can use ifdefs for.

* Named initializers for struct members

Cool feature. I'd hardly say I couldn't live without it, though. The only place I've really seen it used extensively is in the Linux kernel.

* Stuff that no one uses, such as "static" as used in these slides, or complex numbers, or whatever.

Not a problem since these don't see wide use.

Mixed declarations and code are absolutely essential in a modern C programming style. It's all very well to declare things like "your functions should be shorter" or "you shouldn't use temporary variables in macros", but it's just not practical in real world code. In fact, it's so impractical they amended the spec 14 years ago.

Then there are things like the portable integer types - which you decided aren't necessary because everyone is writing Win32 code (???), the designated initializers which I use every day to make my struct initializing code safe from changes in the struct itself, for loops, a consistent bool type ... heck, I could go on all day if you hadn't already dismissed the rest of the standard as "stuff that no one uses".

I really don't mean to come across as aggressive, but this kind of ignorant and dismissive attitude is so rampant in the tech scene at the moment. And it's getting us nowhere.

Not meaning you to make you feel like you're inching towards the aggressive, I just feel like the distance between C89 and C99 is not a very big one. Even in C99 I tend to use some perhaps conservative style choices, and in this view the C99 stuff really does seem like "niceties" rather than essentials. The difference between C and C++ feels much bigger to me.

And then after all, you initially explained the need for this in terms of Windows-specific code (or at least that's how I read it), so that's the angle I took...

It's better to

typedef void ( * fptr_t)()

and cast to fptr_t instead of void * But it's bigger problem how to force users to cast it back to proper prototype, because casting it back to something else will give UB.

"On machines where pointers do not fit ints (basically all 64bit machines), you just silently (due to the cast there is no warning) truncated a pointer. Worse, it may work depending on the malloc implementation and how much memory you allocate."

Wow! Ugly! Scary! Another good reason to know in fine details just what cast does. At one point, my Visual Basic .NET code actually calls some old C code, and in time I will need to convert to 64 bit addressing. So, I will keep in mind that with 64 bits I have to be especially careful about pointers and C.

Well, the bug only appears if you didn't include the header file declaring malloc (or any other void*-returning function). If you did, there's no problem.

It's a pretty easy thing to leave out, missing a required header, but if you always compile under -Wall it'll catch this and many other problems as well.

Thanks for the details, but I remain concerned about taking a language, clearly with some tricky aspects, so closely identified with 16 and 32 bit computing into 64 bit computing.
It seems to me that the problem is that you're using an undefined function (when there are flags to stop this), not casting.