by grotorea 1030 days ago
I wonder what the best solution is here, then. A different language that actually is portable assembly, or has less undefined behaviour or simpler semantics (e.g. RIIR), or making -O0 behave as portable assembly?

Step 1: Define just what "portable assembly" actually means.

An assembly program specifies a sequence of CPU instructions. You can't do that in a higher-level language.

Perhaps you could define a C-like language with a more straightforward abstract machine. What would such a language say about the behavior of integer overflow, or dereferencing a null pointer, or writing outside the bounds of an array object?

You could resolve some of those things by adding mandatory run-time checks, but then you have a language that's at a higher level than C.

> Perhaps you could define a C-like language with a more straightforward abstract machine. What would such a language say about the behavior of integer overflow

Whatever the CPU does. E.g., on x86, two's complement.

> or dereferencing a null pointer

Whatever the CPU does. E.g., on x86/Linux in userspace, it segfaults 100% predictably.

> or writing outside the bounds of an array object?

Whatever the CPU does. E.g., on x86/Linux, it writes to whatever is next in memory, or segfaults.

> You could resolve some of those things by adding mandatory run-time checks, but then you have a language that's at a higher level than C.

No checks needed. Since we're talking about "portable assembly", we're talking about translating to assembly in the most direct manner possible. So dereferencing a NULL pointer literally reads from address 0x0.
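To make the difference concrete, here is a rough sketch (the function name `read_then_check` is made up for illustration). Under a "most direct translation" reading, this always emits a load followed by a compare. Under ISO C, the dereference lets an optimizing compiler assume `p` is non-null, so it may drop the check entirely:

```c
#include <stddef.h>

/* Hypothetical example: dereference first, check afterwards. */
int read_then_check(int *p)
{
    int v = *p;        /* direct translation: a load from whatever address p holds */
    if (p == NULL)     /* an optimizer may delete this branch, since *p above
                          is undefined behaviour when p is null */
        return -1;
    return v;
}
```

With a valid pointer both readings agree; they only diverge when `p` is null, which is exactly the case under discussion.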

> What would such a language say about the behavior of integer overflow

Two's complement (i.e. the result which is equivalent to the mathematical answer modulo 2^{width})
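For what it's worth, you can already get this behaviour in today's C by doing the arithmetic on unsigned types, which are defined to wrap. A minimal sketch, with `wrap_add_i32` being a hypothetical helper name:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 32-bit signed addition that wraps modulo 2^32. */
int32_t wrap_add_i32(int32_t a, int32_t b)
{
    /* Unsigned arithmetic wraps modulo 2^32; the assignment to uint32_t
       reduces the result even on platforms where int is wider. */
    uint32_t sum = (uint32_t)a + (uint32_t)b;

    /* int32_t is required to be two's complement without padding bits,
       so copying the bytes reinterprets the sum without relying on an
       implementation-defined conversion. */
    int32_t result;
    memcpy(&result, &sum, sizeof result);
    return result;
}
```

So `wrap_add_i32(INT32_MAX, 1)` gives `INT32_MIN`, whereas plain `INT32_MAX + 1` on `int32_t` operands is undefined.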

> dereferencing a null pointer

A load/store instruction to address zero.

> writing outside the bounds of an array object

A store instruction to the corresponding address. It's possible this could overwrite something important on the stack like a return address, in which case the compiler doesn't have to work around this (though if the compiler detects this statically, it should complain rather than treating it as unreachable)
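A well-known illustration of the "treating it as unreachable" problem, sketched here with made-up names (`table`, `exists_in_table`) and using an out-of-bounds read rather than a write. The loop bound is off by one, so the last iteration touches `table[4]`. A direct translation would just load whatever sits past the array; an optimizing compiler is instead allowed to reason that the out-of-bounds iteration can never happen, and some have been observed to fold the whole function to "return true":

```c
#include <stdbool.h>

/* Hypothetical four-element table, zero-initialized. */
int table[4];

bool exists_in_table(int v)
{
    /* Bug on purpose: i <= 4 reads table[4], one past the end. */
    for (int i = 0; i <= 4; i++) {
        if (table[i] == v)
            return true;
    }
    return false;
}
```

Calling it with a value that is present returns before the bad access and is well defined; calling it with a value that is absent is where the two interpretations part ways.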

The reason not to define these things is exactly so C can be used as a high-level assembler, and the answer is always “whatever it is that the CPU naturally does”.

"Committee did not want to force programmers into writing portably, to preclude the use of C as a “high-level assembler”"

https://www.open-std.org/JTC1/SC22/WG14/www/docs/n897.pdf

p. 10, line 39

"C code can be portable."

line 30