by mrheosuper 267 days ago
can we just do `if(*ptr == NULL) return;` ?

If «ptr» is not a valid pointer, an attempt to dereference it (i.e. *ptr) will most assuredly crash the process with a SIGSEGV.
But when would it not be a valid pointer, and yet also not a null pointer? A null pointer we can check for easily.
A null pointer is not a valid pointer in a predominant number of systems in existence. If malloc (3) has returned a NULL, *ptr will cause a SIGSEGV.

Embedded systems are an exception, though. They may not have a MMU, and in such a case the operation will succeed.

1. No, dereferencing a null pointer will not "cause a sigsegv". It causes UB. In practice, in unix user space, yes it'll probably be SIGSEGV.
2. A null pointer is not a valid pointer: Yeah… Once again my question was "But when would it not be a valid pointer, and yet also not a null pointer? A null pointer we can check for easily."

This code will NEVER dereference a null pointer. Not under any compiler, not with any compiler options:

    if (ptr != NULL) { *ptr = 0; }

> A null pointer is not a valid pointer in a predominant number of systems in existence.

No, that's not quite pedantically accurate. A null pointer is not a valid pointer in the C programming language. Address zero may or may not be, that's outside the scope of the C language. Which is why embedded and kernel work sometimes has to be very careful here.

> They may not have a MMU, and in such a case the operation will succeed.

Lack of an MMU does not mean address zero is valid. It definitely doesn't make a null pointer valid. In fact, a null pointer may not point to address zero.

A zero (0, not NULL!) pointer is a valid pointer in C/C++. It is not UB, and it means one simple thing: «give me the contents of a memory cell (a byte, a word, a long word etc) at the address of 0». Old hardware designs used address 0 to store the jump address of the system boot-up sequence (i.e. firmware), and I personally wrote C code to inspect and use it in the unprivileged hardware mode.

Most modern systems do not map the very first virtual (the emphasis is on virtual) memory page (the one that starts from zero) into the process address space for pragmatic reasons: an attempt to dereference a zero pointer is most assuredly a defect in the application. Therefore, an attempt to dereference a zero pointer always results in a page fault due to the zeroth memory page not being present in the process' address space, which is always a SIGSEGV on a UNIX system.

Embedded systems that do not have an MMU will allow *ptr where «ptr» is zero to proceed happily. Some (not all) systems may even have a system-specific or device register mapped at address 0.

You are conflating several unrelated things, and there is no pedantry involved – it is a very simple matter with nothing else to debate.

> it means one simple thing: «give me the contents of a memory cell (a byte, a word, a long word etc) at the address of 0»

Well… sometimes. If you set a pointer to literal 0, you do not actually make that pointer point to address zero, from the C language's point of view. No, you are then setting it to be the null pointer. (C99 6.3.2.3 paragraph 3)

Now, what is the bit value of a null pointer? That's undefined.

So how do you even set a pointer to point to address zero? In the C standard, maybe if you set an intptr_t to 0 and then cast it to the pointer? Actually I don't know how null pointer interacts with intptr_t 0. Is intptr_t even guaranteed to contain the same bit pattern? I don't see it. All I see is that it's guaranteed to convert back and forth without loss. For all I can find in the spec, converting between intptr_t and pointer inverts the bits.
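The questions above can be probed, though not settled, with a small sketch. The function names are hypothetical. Only the first result is guaranteed by the C standard; the other two merely report what the implementation you happen to compile on does, and `uintptr_t` is used here as a stand-in for the `intptr_t` mentioned above.

```c
#include <stdint.h>
#include <string.h>

/* Guaranteed by the standard: assigning the constant 0 yields the null
 * pointer, which compares equal to NULL. Always returns 1. */
int zero_constant_is_null(void)
{
    char *p = 0;
    return p == NULL;
}

/* Not guaranteed: reports whether the null pointer's object
 * representation is all-zero bytes on this implementation. */
int null_is_all_zero_bits(void)
{
    char *p = NULL;
    unsigned char zeros[sizeof p] = {0};
    return memcmp(&p, zeros, sizeof p) == 0;
}

/* Not guaranteed: reports whether a null pointer converts to the
 * integer value 0 on this implementation. */
int null_converts_to_zero(void)
{
    char *p = NULL;
    return (uintptr_t)p == 0;
}
```

On mainstream desktop and server platforms all three would be expected to return 1, but only the first is something portable code can rely on.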

A null pointer "is guaranteed to compare unequal to a pointer to any object or function".

Did you put an object or function at address zero? Sounds pretty UB to me.

> modern systems […] SEGV

I already agreed with you on this. I mean… now modern systems don't let applications map address zero (actually, is that always true? I know OpenBSD stopped allowing it after some security holes. I'm too lazy to check if Linux did too)

More info at https://stackoverflow.com/questions/63790813/allocating-addr...

In any case, this is a fix that's only like 10 years old (or I'm old and it's actually 20). It used to be possible.
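For anyone who does want to check, here is a minimal sketch that asks the kernel for an anonymous page at address zero, assuming a POSIX system with `MAP_ANONYMOUS`. The function name is hypothetical. On Linux the `vm.mmap_min_addr` sysctl normally makes this fail for unprivileged processes, so a result of 0 is the likely outcome there, but the answer depends on the system configuration and privileges.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if an anonymous page could be mapped at address zero,
 * 0 if the request was refused. Any mapping is removed before returning. */
int can_map_page_zero(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return 0;

    /* With MAP_FIXED, an address argument of 0 means "exactly address
     * zero", not "let the kernel choose". */
    void *p = mmap(0, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED) /* compare against MAP_FAILED, never NULL */
        return 0;

    munmap(p, (size_t)page);
    return 1;
}
```

Note that success has to be detected with `MAP_FAILED`, because a successful mapping at address zero returns a pointer that compares equal to NULL on typical platforms.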

> Embedded systems that do not have an MMU will allow *ptr where «ptr» is zero to proceed happily.

This is absolutely not true. An embedded system could have I/O mapped at address zero that reboots the machine on read or write. And that'd be perfectly fine for the C language spec, since C doesn't allow dereferencing a null pointer.

MMU is not the only way memory becomes magic. In fact, it's probably the LEAST of the magic memory mapping that can happen.

> with nothing else to debate.

I mean… you're just wrong. I'm not conflating unrelated things. I'm correcting multiple unrelated mis-statements you made.

To add things up though: Let's say you intend to read from address zero, so you do `char* ptr = 0; something(*ptr);`. The C standard would allow this to set ptr to 0xffff, and reading from that address starts the motor. The C standard doesn't say. It just says that assigning 0 sets it to the null pointer, which on some systems is 0xffff.

I've certainly worked on embedded stuff that "did stuff" when an address was read. Sometimes because nobody hooked up the R/W pin, because why would they if the address goes to a motor where "read" doesn't mean anything anyway?

The "ptr" is a pointer to a pointer, not just a pointer. You are not dereferencing a null pointer, so I expect nothing to crash.
No, because optimizing compilers are free to elide the check. https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#ind...
I'm not quite familiar with this flag, but this

>so that if a pointer is checked after it has already been dereferenced, it cannot be null.

sounds to me like if I've never dereferenced the pointer before the check (e.g. the null check is at the beginning of the function), the compiler won't remove this check.

Since the compiler will merge/fold what appear to be different logical sections of your code into a single one, you can never be sure what the release build codegen looks like unless you read the assembly.
If you check for a null pointer before you dereference, then no, the compiler cannot elide the check.

If you check after dereferencing it, yes it can. But in this case why would you not check before dereferencing? It's the only UB-free choice.

Yes, it can. Why would you be checking the pointer for nullptr after you have dereferenced it? It makes no sense at all, so, compiler indeed can elide the nullptr check before dereferencing the ptr exactly because it is free to _always_ assume that the program is free of UB.

To be more precise, GCC says "eliminate useless checks for null pointers", and what I am saying is that you can never be sure what in your code ended up being a "useless check" vs. a "useful check" according to the GCC dataflow analysis.

The Linux kernel is a famous example of disabling this code transformation because it is considered harmful. And there's nothing harmful about the nullptr check from your example.

> Why would you be checking the pointer for nullptr after you have dereferenced it? It makes no sense at all

Right. It's UB. And that's why the optimization in question is about removing that check. The only reason the optimization is valid for a C compiler to do, is that it can assume dereferencing a null pointer lands you in UB land.

I'm sorry, either you are terrible at trying to explain things, or you have thoroughly misunderstood what all this is about. GCC cannot, under any circumstances or with any flags, remove an "if (ptr == NULL)" that happens before dereferencing the pointer.

What this flag is about, and what the kernel bug you mentioned (at least I think you're referring to this one) is about, was a bug that went "int foo = ptr->some_field; […] if (ptr == NULL) { return -EINVAL; }". And GCC removed the post-deref null pointer check, thus making the bug exploitable.
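The two shapes can be put side by side in a minimal sketch. The struct and function names are hypothetical and only mirror the snippet quoted above; this is not the actual kernel code.

```c
#include <errno.h>
#include <stddef.h>

struct device {        /* hypothetical type, for illustration only */
    int some_field;
};

/* Buggy shape: the dereference comes first. Passing NULL is already UB
 * at that line, so the compiler may assume ptr is non-null and treat
 * the later check as useless. */
int read_field_buggy(const struct device *ptr)
{
    int foo = ptr->some_field;
    if (ptr == NULL)    /* may be removed by the optimizer */
        return -EINVAL;
    return foo;
}

/* Correct shape: the check dominates the dereference, so it cannot be
 * removed, and NULL never reaches ptr->some_field. */
int read_field_checked(const struct device *ptr)
{
    if (ptr == NULL)
        return -EINVAL;
    return ptr->some_field;
}
```

With a valid pointer both functions behave the same; only `read_field_checked` has defined behavior when passed NULL.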

From the help text:

> if a pointer is checked after it has already been dereferenced, it cannot be null.

after. Only applies after. A check before dereferencing can never be removed by the compiler.

Obviously.

> Yes, it can.

I don't think so. If it could, then this code would reliably crash:

    char *mystr = strdup (oldstr);
    if (mystr)
        *mystr = 0; // Truncate string

That never crashes.
Not on all platforms! If you’re writing portable code targeting a lot of embedded platforms then you don’t want to rely on this optimization.
It's a platform-agnostic optimization in case of GCC so if your embedded Linux toolchain is based on GCC, and most of them are, it's pretty much the case that it will have this optimization turned on by default.

> This option is enabled by default on most targets. On AVR and MSP430, this option is completely disabled.

Yes, and if you’re targeting AVR, an extremely popular 8-bit micro, then it’ll be turned off.
Yes, but `*ptr == NULL` is just plain wrong ... it should be `ptr == NULL` ... but that test is redundant since `free` is required to do it.
> can we just do `if(*ptr == NULL) return;` ?

No, certainly not, but you can do

`if(ptr == NULL) return;`

which is correct but unnecessary since `free` is required to do that check.
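If, as one reply above suggests, `ptr` is actually a pointer to a pointer, the two checks mean different things. A minimal sketch of that case, with a hypothetical helper name and signature (the original function is not shown in this thread):

```c
#include <stdlib.h>

/* Hypothetical helper: frees the allocation that *ptr refers to and
 * clears *ptr so that it cannot be freed twice by accident. */
void free_and_clear(void **ptr)
{
    if (ptr == NULL)    /* needed: *ptr would dereference a null pointer */
        return;
    free(*ptr);         /* no check needed: free(NULL) is a no-op */
    *ptr = NULL;
}
```

Here the outer `ptr == NULL` check guards the dereference, while a `*ptr == NULL` check before `free` would be redundant. Calling the helper twice on the same variable is harmless because the second call passes a null pointer to `free`.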