by matheusmoreira 886 days ago
> Better be rolling your own compiler in that case.

No need for that.

> the pointer is assumed to be non-null

Just give us an option to tell the compiler to stop assuming nonsense like that. I'm gonna make it standard on my makefiles just like -fno-strict-aliasing and -fwrapv.
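To make the flag point concrete, here's a minimal sketch of what -fwrapv changes. Both function names are made up for illustration. The first check only behaves as written when signed overflow is defined to wrap; the second asks the same kind of question without ever overflowing, so it doesn't depend on any flag.

```c
#include <limits.h>

/* Hypothetical helper: the positive-overflow check people write by instinct.
 * In standard C, signed overflow in a + b is undefined, so a compiler is
 * allowed to fold this to "return 0". With -fwrapv (GCC and clang) the
 * addition wraps and the comparison means what it says. */
int overflows_needs_fwrapv(int a, int b)
{
    return b > 0 && a + b < a;
}

/* Hypothetical helper: detects overflow in either direction without
 * performing the overflowing addition, so no flag is required. */
int overflows_portable(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    return a < INT_MIN - b;
}
```

The second form is the usual workaround when you can't control the flags, which is exactly the kind of contortion the flags are meant to make unnecessary.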

There's no use trying to work around C standard problems. Compilers should just be told to define the undefined and to disable everything that can't be defined. Then we can write code on solid foundations instead of quicksand.

> Or your own memcpy with a different name.

I wish. I couldn't escape that function even on my freestanding nolibc project. The compilers will happily emit calls to memcpy and memset all by themselves whenever they feel like it, and god help you if you don't provide them, because for some reason this nonsense can't be disabled.
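For anyone who hasn't run into this, a rough sketch of the kind of code that triggers it. The struct and function names are hypothetical, and whether a call is actually emitted depends on the compiler, target, and optimisation level. Note that no library function is named anywhere in the source.

```c
/* Hypothetical type, only for illustration. Large enough that compilers
 * commonly prefer a library call over inline stores. */
struct frame {
    unsigned char bytes[512];
    unsigned long length;
};

/* The zero initialisation is commonly lowered to a call to memset. */
struct frame frame_make_empty(void)
{
    struct frame f = {0};
    return f;
}

/* The struct assignment is commonly lowered to a call to memcpy. */
void frame_copy(struct frame *dst, const struct frame *src)
{
    *dst = *src;
}
```

In a freestanding build the linker then asks for memset and memcpy, so the project has to supply definitions even though it never calls them by name.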


LLVM's handling of libc is roughly "assume libc always exists and is linked as machine code". This is deeply unhelpful when that is not true, such as when you're implementing libc. -ffreestanding and -fno-builtins (might be spelled differently) should kill the pattern match into memcpy/memset logic, if it doesn't we have another bug.

I don't trust the clang -fno-strict-aliasing -fno-pointer-whatever strategy. There's too many ways for that to go wrong. Code needs to be correct/safe by default and opt into optimisations to have a chance of working, otherwise it is really easy to fail to check that flag.

There are a few fairly simple C compilers out there: LCC, TCC (the one that derives from an obfuscated C contest entry), and the one in GNU Mes associated with Guix. There's also a C grammar from the CompCert people.

I haven't convinced myself that writing a working C compiler is a weekend project, but it's surely less than a year, so I'm seriously considering it on paranoia grounds. The idea is to use it as a reference: when I suspect clang of breaking things, run the same code through the dumb compiler that doesn't really do optimisations and compare.

> -ffreestanding and -fno-builtins (might be spelled differently) should kill the pattern match into memcpy/memset logic, if it doesn't we have another bug.

Not only does it not kill that pattern, gcc preempts that bug report by documenting it.

It forces us to link to some weird libgcc.a thing too.

https://gcc.gnu.org/onlinedocs/gcc/Link-Options.html

> The compiler may generate calls to memcmp, memset, memcpy and memmove.

> These entry points should be supplied through some other mechanism when this option is specified.

> In most cases, you need libgcc.a even when you want to avoid other standard libraries.

I can't even find clang's documentation for the nostdlib option. Not sure that documentation even exists. I only found documentation for nostdlib++ which suggests they don't really care about C and the things people use it for.

There's just no way to get these compilers to generate a clean binary with only the symbols provided in the source code and without any of this external stuff. Even with all these options I might run readelf on my binary and find a ton of random double underscore gcc stuff in there.

The whole point of writing a freestanding project was to escape from the libc nonsense, which actually makes C a much better language, but then the compiler just forces all those things back in. Extremely frustrating!

> I don't trust the clang -fno-strict-aliasing -fno-pointer-whatever strategy. There's too many ways for that to go wrong.

Well that strategy seems to be good enough for the Linux kernel.

https://lwn.net/Articles/316126/

https://lkml.org/lkml/2003/2/26/158

It should be -fno-builtin for disabling the pattern match in clang, and things like -fno-builtin-memset to disable them individually. However, that'll be implemented with branches in the backend codegen, and it's totally plausible that'll be buggy.

That at least somewhat works because otherwise the memset implementation in libc is prone to being optimised to a call to memset, and thus to a self call which doesn't terminate, and thus to undef.
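A minimal sketch of the hazard, assuming a GCC-compatible compiler. The name my_memset is hypothetical; in a real libc this would be exported as memset, which is where the self-call comes from. The empty asm statement is one commonly used trick to stop the loop being recognised as a memset idiom. It is not a documented guarantee, and it likely costs vectorisation.

```c
#include <stddef.h>

/* Hypothetical name. If this were named memset and the loop were recognised
 * as a "fill n bytes" idiom, the body could be replaced by a call to memset,
 * i.e. a call to itself. */
void *my_memset(void *dst, int value, size_t n)
{
    unsigned char *p = dst;
    for (size_t i = 0; i < n; i++) {
        p[i] = (unsigned char)value;
#if defined(__GNUC__)
        /* Empty asm with a memory clobber (GNU extension, also accepted by
         * clang). The optimiser treats it as opaque, which in practice keeps
         * the stores from being merged into a library call. */
        __asm__ volatile("" ::: "memory");
#endif
    }
    return dst;
}
```

Building the file that defines the real memset with -fno-builtin or -ffreestanding is the intended fix; the barrier is only a belt-and-braces measure for when you don't trust the flag.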

I've been bitten by this nonsense on two back ends now, so I really should move it up the work queue. I think there's an active discussion on Discourse about making magic libc functions less hazardous.

Can't comment on gcc. Bare metal broadly works on GPUs on clang. X86 is moderately likely to emit calls to memcpy despite flags, but that would be a bug.

The Linux kernel is tested rather more aggressively than my code but my fear of compiler bugs remains firmly established. Occupational hazard of working on compilers - most bugs I see are in the toolchain, as that's where I'm looking.