Hacker News new | ask | show | jobs
by joejoint 2837 days ago
Why not just hash the string using constexpr ?

====================

C preprocessor based compile time hash from lolengine:

http://lolengine.net/blog/2011/12/20/cpp-constant-string-has...

====================

#define H1(s,i,x)(x65599u+(uint8_t)s[(i) <strlen(s)?strlen(s)-1-(i):strlen(s)])

#define H4(s,i,x) H1(s,i,H1(s,i+1,H1(s,i+2,H1(s,i+3,x))))

#define H16(s,i,x) H4(s,i,H4(s,i+4,H4(s,i+8,H4(s,i+12,x))))

#define H64(s,i,x) H16(s,i,H16(s,i+16,H16(s,i+32,H16(s,i+48,x))))

#define H256(s,i,x) H64(s,i,H64(s,i+64,H64(s,i+128,H64(s,i+192,x))))

#define HASH(s) ((uint32_t)(H256(s,0,0)^(H256(s,0,0)>>16)))

template<size_t N> constexpr uint32_t h(char const (&s)[N]) { return HASH(s); }

constexpr uint32_t h(const char s) { return HASH(s); }

uint32_t h(const std::string& s) { return h(s.c_str()); }

int main(int argc, const char argv) {

    auto s = "keep";
    std::string x = "simple";

    switch(h(x))
    {
        case h("keep")  : std::cout << "keep"; break;
        case h("it")    : std::cout << "it"; break;
        case h("simple"): std::cout << "simple"; break;

    };
3 comments

Hash collisions would be one obvious problem with this approach. You could likely always tweak the hash used in response to collisions on the set of constants, but that's not going to be fun the first time it happens.

Also with that approach since it can't be made into a jump table it's still going to compile into a bunch of if/else combinations. At least with the trie it can early-return if the input doesn't have any chance of matching anything.

This is what I have done. However, you still have the risk of collisions. Depending on the context, like i did a small test and permuted all letters a-z and numbers and there are no collisions. However, changing a-z to A-Z resulted in many(may need to check code). However, this allows for a 5 character command switch.
The hash has to look at all the chars in the string.

Does the trie have to as well?

The string compare has a lot of branching and looks at both strings. The hash method one checks the hash of the test string(size_t) with the size_t of the other string. The hash itself, say fnv1a, can be very quick(a xor and a multiply I think per character) I think I remember testing this and it was quicker. ill try to find the test, but it is easily done. Either way, it's the only way to use a switch statement this way
If it’s possible for the string to not be in the set then you have to look at all chars.

Otherwise there are some cases you can skip characters. I generally don’t use this optimization when writing Tries because it’s tricky and rarely makes a difference.