Hacker News new | ask | show | jobs
by cybojanek 3980 days ago
Yes you can - take the hash of the input and token, and compare those. Now you're trying to cause a sha256 collision. Good luck!
3 comments

Except that that still leaves you open to a dictionary attack.

Or rather, it accelerates a dictionary attack from O(sizeof(dictionary)) tries to O(length(hash)) tries.

(Let's suppose you are comparing by hex digits. You try 16 dictionary "words" whose hashes start with 0...f. One of those should take slightly longer. You then try 16 dictionary "words" whose hashes start with the digit found previously with different second digits. (Skip any digits that you don't have a word for.) One of those should take slightly longer... Repeat until found.)

This can be mitigated with a salt - if and only if you manage to prevent the user from figuring out how your salting scheme works.

You still have to perform the byte-by-byte comparison and this is where TheLoneWolfling alleges the optimizations kick in and give away secrets. However, from my testing, the variations in timing are unpredictable (I get variations on compare speed on different runs of the same program with the same input) and small and would be swallowed up by variations in network latency.
You could probably do that relatively safely with bit operations, but I expect to be proved wrong.

Something like this should work:

    /* Swap char for uint<register_size> for speed */
    #define CMP_TYPE char

    /* Assuming char* sha256(char* input); */
    CMP_TYPE* input_hash=(CMP_TYPE*) sha256(input);
    CMP_TYPE* token_hash=(CMP_TYPE*) sha256(token); /* Precomputed? */
    /* max_pos=32 for char on most systems */
    int max_pos=256 >> (3 + sizeof(CMP_TYPE));

    char cmp=0;
    for (int i=0;i<max_pos;i++) {
        cmp = cmp | (token_hash[i] ^ input_hash[i]);
    }
    return (cmp != 0);
The reason for the CMP_TYPE #define is so that you can optimise the comparison by replacing char with uint32 or uint64.
Again, the compiler is free to optimize that to data-dependent time. It's relatively unlikely to do so on current machines (the time saved is unlikely to be worth the potential of a pipeline stall), but it's free to do so.

For instance, it's legal for the compiler to insert a strcmp fallthrough (`if input == token: return true`, or rather the strcmp equivalent) at the start, I'm pretty sure.

I have a bunch of other replies in this thread that touch on this, but suffice to say it's not just the comparison that matters.

For one thing, you have to assume that the SHA function is data-independent time (which, again, good luck doing in C / C++).

For another thing, noise in timing attacks doesn't prevent them. Even at levels of noise that seemingly obscure everything. And it's a very bad thing to rely on network latency being unpredictable enough.

Since an optimizing compiler is free to go to town on functions such as string comparison, are these functions typically written in assembler per-architecture? I.e. one for x86 and x86-64, another for MIPS, RISC, etc?
a) You're just punting the timing attack off to the SHA algorithm.

b) There is no SHA algorithm in the C / C++ standard library (that I know of), muchless a guaranteed constant-time (or rather, data-independent) one.

c) The compiler is well within its rights to insert "busy_loop_for_ms(input_char);" anywhere it wishes. It is unlikely to do so, but it is allowed to. As I said: C and C++ don't have any notion of constant or variable time.