Hacker News
by fp64 7 days ago
I think the article names hashing as a use-case, which I can still somewhat agree with: operations that depend only on the bytes, I guess. But yeah, most things worth saying about this article have already been said here.
3 comments

Sure, if the function is expected to not treat the data as anything but bytes, then it might be acceptable in narrow circumstances.

But in such a case I'd argue FOR the ceremony, as a way of declaring from the API "The input is a sequence of bytes that I won't treat as anything other than a sequence of bytes", and declaring from each and every call site: "This is not a mistake; we really are 'converting' this struct to a series of bytes for this function to consume".

Then anyone auditing the code knows the intent by the shape of the types, and would quickly flag any typecasting shenanigans within the receiver function.

But even then, hashing a struct will rapidly bring you into the land of dragons and fairies. Abandon all hope if you have floats or UTF-8 text (both of which can have multiple representations for the same value).

Far better to remain type-aware if you value your sanity.

I agree, the original article is rather questionable. I do not write code the way the article advocates. I would probably go for overloads for each data type I have considered and tested, or maybe something fully templated, or std::span/boost::span (a hash function is, interestingly enough, the very example the Boost docs give to illustrate boost::span).
A more immediate concern for hashing by treating a struct as a bag of bytes is padding.
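
A small sketch of the padding problem, assuming a typical ABI where `std::uint32_t` is 4-byte aligned. `Padded`, `Packed` and `byte_hashable` are hypothetical names; `std::has_unique_object_representations_v` is the standard C++17 trait.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical: on typical ABIs, 3 padding bytes follow `tag`,
// and their contents are unspecified.
struct Padded {
    std::uint8_t  tag;
    std::uint32_t value;
};

// Hypothetical: two same-size members, no padding on typical ABIs.
struct Packed {
    std::uint32_t a;
    std::uint32_t b;
};

// True only if two objects with the same value always have the same bytes,
// which is the property a bag-of-bytes hash silently relies on.
template <class T>
inline constexpr bool byte_hashable = std::has_unique_object_representations_v<T>;
```

A byte-level hash function could `static_assert(byte_hashable<T>)` to reject types like `Padded` at compile time instead of hashing whatever happens to be in the padding.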
Hashing everything based on the byte representation breaks when you have a type where equality does not imply byte equality. Such as… floats (+0 and -0 are equal, but have different byte representation).
It depends on the use-case. Hashing can also be used for checking integrity or detecting change, in which case bit-exact equality is exactly the behavior you want, even for arbitrary structs. Maybe that's somewhat niche; I mention it because I actually have such a use-case.
Even then, accepting a uint8_t* would make this intent clearer.