Couldn't this entire class of bug be solved by annotating signal handlers in the source code and checking at compile time that anything called from a signal handler is async-signal-safe?
Sounds reasonable, but since the language layer has no knowledge of signal handlers or what that means, it would be a separation of concerns problem. I'm sure you could get clang to do it, but still a tricky thing to design around.
Ultimately it's an example of an invariant where it's clear that programmers can't be trusted to uphold it. In this case, the consequences can be very significant.
Libc isn't the language runtime. The runtime is '/usr/lib/crt*.o', which has no concept at all of signal handling.
Libc isn't particularly intrinsic to the language, and outside of some assembly to make syscalls, you can implement an alternative with a completely different interface, purely in C.
The language standard library does, however, contain explicit support for signal handling, as specified in ISO/IEC 9899:2024 section 7.14 Signal handling <signal.h>. The cross-platform bits of libc are specified in the C standard. The POSIX-specific bits are specified by the Open Group in the POSIX standard. The OS-specific bits are specified by the OS and implemented by whoever is writing the libc in question. A libc is a sort of statically linked combination of the C standard library and some OS-specific standard library extensions.
I am fairly certain that glibc uses SA_RESTORER in its sigaction wrapper and implements a suitable sigreturn() function which is provided as the sa_restorer argument.
Static analysis tools would go a long way here, yes, and it should be a relatively straightforward analysis. You probably don't even need to explicitly annotate signal handlers, just examine arguments to calls to signal() and sigaction().
This entire class of bug could also be solved by avoiding signal handlers. You can still use SIGALRM for a timeout, but don't log it. If you need complex processing, use signalfd to read signals in the event loop.
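To make the signalfd suggestion concrete, here is a minimal Linux-only sketch. The helper names open_signal_fd and read_one_signal are made up for illustration; the idea is to block the signal so it is never delivered asynchronously, then read it as ordinary data from a file descriptor that an event loop could poll.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Hypothetical helper: block `signo` and return a descriptor that
 * becomes readable when the signal is pending. Returns -1 on error. */
int open_signal_fd(int signo) {
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, signo);
    /* The signal must be blocked, otherwise the default disposition
     * (or an installed handler) would still run. In a multi-threaded
     * program pthread_sigmask would be used instead. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
        return -1;
    return signalfd(-1, &mask, SFD_CLOEXEC);
}

/* Hypothetical helper: read one pending signal from the descriptor.
 * Runs in normal context, so logging or allocating here is fine.
 * Returns the signal number, or -1 on a short read or error. */
int read_one_signal(int fd) {
    struct signalfd_siginfo info;
    ssize_t n = read(fd, &info, sizeof info);
    if (n != (ssize_t)sizeof info)
        return -1;
    return (int)info.ssi_signo;
}
```

In a real program the descriptor would be registered with poll or epoll alongside the other event sources, and read_one_signal would be called when it becomes readable.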
Well, the compiler has no way of knowing if a function will later be a signal handler after linking, or even dynamic loading.
There is no portable way to annotate all functions ever written or ever will be written as being async signal safe.
Which functions are async signal safe varies with the operating system and runtime (eg. an unsafe function in linux-gnu might be safe in linux-musl or linux-bionic).
Other than those insurmountable problems, yeah, good idea.
What I want when writing a signal handler is to be able to say, this function must be async-signal-safe and therefore all the functions it calls must be async-signal-safe. That can be done purely at compile time; I don’t need to worry about linking.
The annotation does not need to be portable; if it’s present on one system then other systems still benefit because the code is written to pass the check.
The list of async-signal-safe functions is well documented and quite short, so it would not be much work to add the annotations to the header files. It’s OK if some safe functions are omitted, because signal handlers should be written to do the absolute bare minimum.
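For reference, a "bare minimum" handler can look something like the sketch below: it only sets a volatile sig_atomic_t flag, and the main code polls that flag later. The function names are hypothetical, and SIGALRM is just the timeout example from upthread.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Only written from the handler, only read from normal code. */
static volatile sig_atomic_t timed_out = 0;

/* The handler does nothing but record that the signal arrived. */
static void on_alarm(int signo) {
    (void)signo;
    timed_out = 1;
}

/* Hypothetical helper: install the handler; returns 0 on success. */
int install_alarm_handler(void) {
    struct sigaction sa;
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGALRM, &sa, NULL);
}

/* Hypothetical helper: polled from the main loop, where logging
 * and other non-reentrant work can be done safely. */
int has_timed_out(void) {
    return timed_out != 0;
}
```

Anything more elaborate (logging, freeing memory, cleanup) then happens in the main loop after has_timed_out() returns nonzero, not in the handler.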
No, any function can call another function from another translation unit (at link time) or load and call a function from another translation unit (at runtime). How will the compiler enforce the propagation of the requirement in those cases?
> compiler has no way of knowing if a function will later be a signal handler after linking, or even dynamic loading
You could check it at runtime.
Just like with array bounds checking, in many cases the compiler could prove the runtime check isn't necessary and eliminate it.
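A rough sketch of what such a runtime check might look like, done by hand rather than by the compiler. Everything here is hypothetical (guarded_log, the flag, the handler), and it assumes a single-threaded program; a real version would need per-thread state.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <stdio.h>

/* Nonzero while a signal handler is running (single-threaded assumption). */
static volatile sig_atomic_t in_signal_handler = 0;
/* Records what the guarded function returned inside the handler. */
static volatile sig_atomic_t result_in_handler = 0;

/* Hypothetical unsafe function: refuses to run in signal context.
 * Returns 0 if the message was logged, -1 if the call was rejected. */
int guarded_log(const char *msg) {
    if (in_signal_handler)
        return -1;               /* the runtime check */
    fprintf(stderr, "%s\n", msg); /* stdio is not async-signal-safe */
    return 0;
}

static void on_usr1(int signo) {
    (void)signo;
    sig_atomic_t prev = in_signal_handler; /* tolerate nested handlers */
    in_signal_handler = 1;
    result_in_handler = guarded_log("from handler");
    in_signal_handler = prev;
}

int install_usr1_handler(void) {
    struct sigaction sa;
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

int last_handler_result(void) {
    return (int)result_in_handler;
}
```

Here the rejected call just returns -1; a stricter variant could abort() instead so the bug shows up in testing.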
> Which functions are async signal safe varies with the operating system and runtime
Annotations could enumerate specific platforms where it is safe or unsafe. Or you could annotate based on specific attributes of platforms that make it safe or unsafe.
> Well, the compiler has no way of knowing if a function will later be a signal handler after linking, or even dynamic loading.
That's why GP suggested annotating them. Typically this would be done via a function attribute.
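As a sketch of what that could look like: the ASYNC_SIGNAL_SAFE macro below is hypothetical, not an existing libc or compiler feature. Under clang it expands to the generic annotate attribute, which only attaches a string for tooling to read; nothing is enforced by the compiler itself, and the checker that would verify the call graph is assumed to exist separately.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical marker. clang's annotate attribute just tags the
 * declaration; other compilers see an empty macro. */
#if defined(__clang__)
#define ASYNC_SIGNAL_SAFE __attribute__((annotate("async_signal_safe")))
#else
#define ASYNC_SIGNAL_SAFE
#endif

static volatile sig_atomic_t stop_requested = 0;

/* Marked safe: touches only a sig_atomic_t flag. */
static ASYNC_SIGNAL_SAFE void note_stop(void) {
    stop_requested = 1;
}

/* Marked safe: an imagined checker would require that every callee
 * visible in this translation unit carries the same marker. */
static ASYNC_SIGNAL_SAFE void on_usr2(int signo) {
    (void)signo;
    note_stop();
}

int install_usr2_handler(void) {
    struct sigaction sa;
    sa.sa_handler = on_usr2;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR2, &sa, NULL);
}

int stop_was_requested(void) {
    return stop_requested != 0;
}
```

A call from on_usr2 to an unmarked function such as printf is what the imagined check would reject at compile time.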
> There is no portable way to annotate all functions ever written or ever will be written as being async signal safe.
This is not a requirement for such an annotation to exist and to be used by projects that care about security or even just correctness.
> Which functions are async signal safe varies with the operating system and runtime (eg. an unsafe function in linux-gnu might be safe in linux-musl or linux-bionic).
And libc implementations already annotate many of their functions to tell the compiler how they work. Compilers are also more than happy to assume the behavior of standard functions matches the C/C++ standards in non-freestanding environments.
> Other than those insurmountable problems, yeah, good idea.
All fairly trivial problems that have already been solved many times for similar issues.
I'd like a more general attribute, though, to declare that a particular function is in some abstract domain, plus annotations that certain functions may or may not be called in certain domains. This could come in useful in cases where you want some functions to only be called from special threads.
> That's why GP suggested annotating them. Typically this would be done via a function attribute.
That won't help when you link external functions or worse, dynamically load them. Those are things done long after the compiler has run.
> And libc implementations already annotate many of their functions to tell the compiler how they work. Compilers are also more than happy to assume the behavior of standard functions matches the C/C++ standards in non-freestanding environments.
We're not talking about standard functions here, we're talking about any function any developer could ever call in a signal context. Ever. Like, for example, a libssh shutdown function that invokes a callback that calls a syslog function that does some socket operation on a buffer that some other thread has already freed. Which of those functions needs the annotation, and how does dlsym() deal with it?
CERT's SIG30-C rule page has a list: https://wiki.sei.cmu.edu/confluence/display/c/SIG30-C.+Call+...
Also there's https://clang.llvm.org/extra/clang-tidy/checks/bugprone/sign...