by s28l 1318 days ago
1. It is a compiler defined function (`__builtin_unreachable()`), but the issue is that MSVC doesn't have it, so you need a different implementation per compiler [0]. Plus, if a new compiler shows up (besides MSVC/GCC/LLVM), you'd need to investigate what the correct way to express `__builtin_unreachable` is.

From a compiler perspective, using a function makes the most sense, since that fits into the existing control-flow analysis that the compiler will do. Pragmas are processed by the pre-processor, so they aren't appropriate for expressing control flow hints.
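To make the per-compiler split concrete, here is a minimal sketch. `UNREACHABLE_SKETCH` and `channel_index` are names I made up for illustration; `__assume(0)` is the MSVC spelling and `__builtin_unreachable()` the GCC/Clang one. The `std::abort()` fallback is my own assumption for compilers that have neither.

```cpp
#include <cstdlib>

// Hypothetical macro name; each branch covers one compiler family.
#if defined(_MSC_VER)
    #define UNREACHABLE_SKETCH() __assume(0)              // MSVC intrinsic
#elif defined(__GNUC__) || defined(__clang__)
    #define UNREACHABLE_SKETCH() __builtin_unreachable()  // GCC and Clang builtin
#else
    #define UNREACHABLE_SKETCH() std::abort()             // assumption: fail loudly on unknown compilers
#endif

enum class Color { Red, Green };

// Every enumerator returns inside the switch, so the hint after it
// is never executed for a valid Color value.
constexpr int channel_index(Color c) {
    switch (c) {
        case Color::Red:   return 0;
        case Color::Green: return 1;
    }
    UNREACHABLE_SKETCH();
}
```

Because the hint is an ordinary call expression, it sits at a specific point in the function body, which is what lets it take part in control-flow analysis.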

2. `std::to_underlying(t)` is a wrapper around `static_cast<std::underlying_type_t<std::remove_cv_t<std::remove_reference_t<decltype(t)>>>>(t)`. So a lot fewer characters. It's also useful since `std::underlying_type_t<T>` behaves weirdly if `T` is not an enum type.

I think you are maybe missing the context that C++ allows the representation of an enum to be defined, e.g. `enum class X : unsigned char {};` vs `enum class Y : unsigned long long {};`. So you can't always cast to `int`. Technically, this isn't the case in C either: the type defaults to `int`, but the compiler will pick a larger type if necessary, e.g. `enum Z { a = ((long long)INT_MAX) + 1 };`
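As a rough illustration of point 2, here is a sketch of what such a wrapper could look like before C++23. `to_underlying_sketch` is a hypothetical name, not the standard function, and the enumerator values are picked only for the example.

```cpp
#include <type_traits>

// Hypothetical stand-in for C++23's std::to_underlying, usable from C++14.
template <typename E>
constexpr std::underlying_type_t<E> to_underlying_sketch(E e) noexcept {
    return static_cast<std::underlying_type_t<E>>(e);
}

enum class X : unsigned char { a = 200 };
enum class Y : unsigned long long { b = 0xFFFFFFFFFFFFFFFFull };

// The result type follows each enum's declared representation.
static_assert(std::is_same<decltype(to_underlying_sketch(X::a)), unsigned char>::value,
              "X converts to unsigned char");
static_assert(std::is_same<decltype(to_underlying_sketch(Y::b)), unsigned long long>::value,
              "Y converts to unsigned long long");
```

On a platform with a 32-bit `int`, casting `Y::b` to `int` could not preserve the value, which is why the cast target has to come from the enum itself.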

3. `htonl` are not standardized, so they were not part of the C++ standard library. Also, on Windows, I believe you'd need to include `winsock.h` to get access to them, which has its own idiosyncratic issues. You are also missing the context of C++ defining operator overloading, so you can call `std::byteswap(0ull)` and get an `unsigned long long` and you can call `std::byteswap(std::uint16_t{0})` and get a 16 bit unsigned integer.
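Here is a portable sketch of the idea that one function name covers every width. `byteswap_sketch` is a hypothetical stand-in, not the real `std::byteswap`; I restricted it to unsigned types and assumed 8-bit bytes to keep it short.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for C++23's std::byteswap (unsigned types only, 8-bit bytes assumed).
template <typename T>
constexpr T byteswap_sketch(T value) noexcept {
    static_assert(std::is_unsigned<T>::value, "sketch only handles unsigned integers");
    T result = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        // Move the lowest remaining byte of value onto the low end of result.
        result = static_cast<T>((result << 8) | (value & 0xFF));
        value = static_cast<T>(value >> 8);
    }
    return result;
}

// The return type is whatever was passed in.
static_assert(std::is_same<decltype(byteswap_sketch(0ull)), unsigned long long>::value,
              "unsigned long long in, unsigned long long out");
static_assert(std::is_same<decltype(byteswap_sketch(std::uint16_t{0})), std::uint16_t>::value,
              "16-bit in, 16-bit out");
```

Compare that with `htons`/`htonl`, where the width is baked into the function name.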

[0]: https://stackoverflow.com/questions/60802864/emulating-gccs-...


> 3. `htonl` are not standardized, so they were not part of the C++ standard library.

I was questioning the motivation stated in the blog post, namely that this helps with network byte order issues. The proposal [1] (which the post links to) also confirms that the motivation was to expose more machine code intrinsics rather than to deal with network byte order like I expected.

Concerning 1.) yeah, I guess I'd prefer a #pragma aesthetically, but didn't think about the fact that it wouldn't be exposed to the compiler.

Thanks for the well thought out reply!

[1] : https://isocpp.org/files/papers/P1272R4.html#motivation

One easily overlooked use of hton/ntoh is portable, persistent binary data. Do you want to read and write binary files on hosts with different CPU architectures, or mount a pluggable "disk" from different devices? Use hton and ntoh to write and read the binary data.
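A small sketch of that idea, assuming a 32-bit field stored in big-endian ("network") order. I used explicit shifts instead of `htonl`/`ntohl` so the example needs no platform headers; `to_be32` and `from_be32` are hypothetical helper names.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: encode a 32-bit value as four big-endian bytes,
// which is the layout htonl would give you before writing to disk.
constexpr std::array<unsigned char, 4> to_be32(std::uint32_t v) {
    return {{ static_cast<unsigned char>(v >> 24),
              static_cast<unsigned char>(v >> 16),
              static_cast<unsigned char>(v >> 8),
              static_cast<unsigned char>(v) }};
}

// Hypothetical helper: decode four big-endian bytes, whatever the host byte order is.
constexpr std::uint32_t from_be32(const std::array<unsigned char, 4>& b) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8)  |
            static_cast<std::uint32_t>(b[3]);
}
```

The bytes that land in the file are the same on little-endian and big-endian hosts, so either kind of host can read them back.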
> Technically, this isn't the case in C either: the type defaults to `int`, but the compiler will pick a larger type if necessary, e.g. `enum Z { a = ((long long)INT_MAX) + 1 };`

If you read this and alarm bells didn't ring in your head, I really invite you to go check any enum you may have defined in your code right away, because this is absolutely false with MSVC in C++ (https://gcc.godbolt.org/z/6bqW9rE81), and C is often compiled as C++ on Windows.

In practice, I don't think the latitude to make an enum's size depend on its enumerators has ever been used, as it would be too easy to break the ABI. IIRC compilers allowed forward declaring enums as an extension before C++11, which obviously doesn't work if the size depends on the definition.
I don't think that's true. Neither C nor C++ would allow you to invoke a function with an incomplete type, regardless of whether it is a `struct` or an `enum`, so there's no ABI issues with forward declaring them, but you wouldn't be able to define or invoke a function that takes an incomplete type argument by value.

    struct A;
    enum B; // as you pointed out, not allowed by the C++ standard

    // fine to declare, define, and invoke
    void fooA(struct A*);
    void fooB(enum B*);

    // ok to forward declare, but you can't call them
    void barA(struct A);
    void barB(enum B);
Well I misremembered it seems. You can use forward declared enum class or enum : <type> in C++ as per standard:

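    enum class A;  // OK: underlying type defaults to int
    enum B : int;  // OK: underlying type is fixed
    enum C;        // plain enum with no fixed underlying type
    void fooA(A);
    void fooB(B);
    void fooC(C);

    fooA((A)0);
    fooB((B)0);
    fooC((C)0); // error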
but indeed GCC refuses the plain enum. Forward declaring enums in C is an old GCC extension that still doesn't allow passing them while they are incomplete.

Enums in GCC always default to int or unsigned int, unless -fshort-enums is used, which is ABI breaking.

Probably I was misremembering this combination of the int ABI and the forward declaring extension.

>Pragmas are processed by the pre-processor

This is wrong. A lot of pragmas, I would even say most, are not handled by the preprocessor. Some examples: pragma pack, warning control, pragma GCC unroll, per-function optimization setting changes, and all the pragmas that map 1:1 to C++11-style attributes. None of that is handled by the preprocessor; pragma once seems like a rare one that is. Yes, all of them are compiler specific, but handling compiler-specific behavior is one of the main purposes of pragmas.

I meant that they are processed before any syntactical analysis, so they're not particularly useful for this kind of thing. For example, these are all wrong usages of our hypothetical `#pragma unreachable`:

    void foo();
    #pragma unreachable
    class bar
    {
    #pragma unreachable
    };
    namespace foobar {
    #pragma unreachable
    }
But pragmas can generally be used anywhere, barring tokenization issues. The end result is that `pragma unreachable` would likely end up turning into a magic function call inside the compiler, since it really only makes sense in a spot where you can invoke a function.

Also, I think you are conflating pragmas with `__attribute__`, which is how you set per function optimization settings. If you do that with pragmas, then it isn't limited to a single function.

This is the intent and purpose of _Pragma; it provides a way to use existing #pragmas that are tokenized and handled a bit later, so they can e.g. be included in macro expansions.
I think the only benefit of `_Pragma` is that it enables defining a macro that expands into a `#pragma` definition. I don't think there's any other use case beyond that.
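For reference, the macro use case usually looks something like this sketch. The macro names and `doubled` are hypothetical, and I'm assuming the `GCC diagnostic` pragmas, which GCC and Clang both accept; other compilers get empty macros.

```cpp
// Hypothetical macro names. DO_PRAGMA stringizes its argument so that
// _Pragma can appear inside another macro's expansion, which #pragma cannot.
#define DO_PRAGMA(x) _Pragma(#x)

#if defined(__GNUC__)  // Clang defines __GNUC__ too
    #define QUIET_UNUSED_BEGIN \
        DO_PRAGMA(GCC diagnostic push) \
        DO_PRAGMA(GCC diagnostic ignored "-Wunused-variable")
    #define QUIET_UNUSED_END DO_PRAGMA(GCC diagnostic pop)
#else                  // other compilers: expand to nothing
    #define QUIET_UNUSED_BEGIN
    #define QUIET_UNUSED_END
#endif

constexpr int doubled(int x) {
    QUIET_UNUSED_BEGIN
    int scratch = 0;  // deliberately unused; the pragmas are meant to silence the warning
    QUIET_UNUSED_END
    return x * 2;
}
```

A plain `#pragma` line cannot be produced by a `#define`, so without `_Pragma` each of those three directives would have to be written out at every use site.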
If a pragma works for `omp for`, it would work for unreachable. But it is just ugly, and there really wouldn't be any reason to use it here.
> From a compiler perspective, using a function makes the most sense, since that fits into the existing control-flow analysis that the compiler will do. Pragmas are processed by the pre-processor, so they aren't appropriate for expressing control flow hints.

This seems par for the course for all C++ stuff: it's designed from a compiler perspective, not from a programmer perspective.

The correct way, IMHO, is to design features that support the user's workflow, not to design the same feature in a way that makes the compiler's job easier.

> `htonl` are not standardized, so they were not part of the C++ standard library.

They're POSIX standardised. The decision should have been to adopt something that exists in an existing and widespread standard rather than the worrisome not-invented-here syndrome that I see here.

Yeah, but that's platform-specific. It's not part of the C or the C++ standard library.
> Pragmas are processed by the pre-processor, so they aren't appropriate for expressing control flow hints.

I don't think this is remotely true. C++ pragmas were designed with the express purpose of providing additional information to compilers.

Attributes are better suited for that. #pragma has always just been a grandfathered in hack.
> Attributes are better suited for that. #pragma has always just been a grandfathered in hack.

I'm not so sure attributes are better. They have political traction, but that does not mean they are better. All major compilers use pragmas effectively to implement custom compiler flags. See, for instance, how Visual C++ uses pragmas extensively to toggle specific compiler warnings, not to mention the infamous #pragma once.

> You are also missing the context of C++ defining operator overloading, so you can call `std::byteswap(0ull)` and get an `unsigned long long` and you can call `std::byteswap(std::uint16_t{0})` and get a 16 bit unsigned integer.

I can believe this is useful in explicitly-typed form, i.e. using std::byteswap<T> with T specified. But letting T be inferred seems quite dangerous: C++ loves changing integer types around all by itself (via type promotion, for example), and byteswap<int> and byteswap<long> are (on UNIXy systems) simply not the same operation. For that matter, byteswap should really only be used on uintN_t.

byteswap(a+b) is just asking for trouble.
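A quick sketch of why, assuming a mainstream platform where `int` is wider than 16 bits. `deduced_width` is a hypothetical helper that only reports how many bytes a deduced `T` would cover.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: the number of bytes a deduced byteswap would reverse.
template <typename T>
constexpr std::size_t deduced_width(T) noexcept { return sizeof(T); }

constexpr std::uint16_t lhs = 1, rhs = 2;

// Integer promotion: adding two 16-bit operands yields an int, not a uint16_t.
static_assert(std::is_same<decltype(lhs + rhs), int>::value,
              "sum of two uint16_t values is promoted to int");
static_assert(deduced_width(lhs) == 2, "a lone uint16_t deduces a 2-byte T");
static_assert(deduced_width(lhs + rhs) == sizeof(int), "the sum deduces sizeof(int) bytes");
```

So a deduced swap of the sum would reverse `sizeof(int)` bytes of a signed value instead of the two bytes you probably had in mind.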

> Plus, if a new compiler shows up (besides MSVC/GCC/LLVM)

They are still far from being the only game in town.

That was unclear on my part, but I meant a new compiler in terms of "new to the project", not "new to the C++ community". The linked SO answer only covers those three compilers, which illustrates the issue I was talking about (you need to add a new `#elif` branch)