by SilasX
4010 days ago
But those situations aren't the same -- the O(n log n) bound for comparison sorts is expressed in terms of n, the number of elements, which is orthogonal to element size. Even if you did account for the max element value V, each comparison would cost O(log V), a factor that is constant with respect to n. The full bound would then be n log(V) log n -- regardless of your choice of V, you don't affect the scaling with respect to n. But for hashtables, what exactly are we studying that doesn't care about hash time? The whole reason that a hashtable is supposed to save time is that you locate the desired key's value by computing the location from the key itself. To whatever extent that computation requires more steps, that cannot itself be abstracted away -- if only because a limitation on key size is a limitation on table size.
|
But we really don't care about that very much, in this case.
It's kind of described in that SO post I linked-- you can assume a constant-time hashing operation, but if that offends your conscience, assume an "integer RAM" model where arithmetic operations up to a certain word size w are constant time. Then you can observe that any reasonable w is inconsequential to your problem domain, and go back to treating hashing as a constant-time operation :)
The idea is that the fact that w increases at least as the log of n is shared by all computations modeled in integer RAM, so it's not a useful contribution to analysis at that scale. It's in the model, it's just not generally useful to the model. If you ever find yourself asking whether the bit width is relevant, you can do some math and get a straight answer.
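For what "do some math" might look like, here is a rough sketch. The helper names `index_bits` and `fits_in_word` are made up for illustration, and the 64-bit default is just an assumption about a typical machine word, not something the model dictates:

```python
def index_bits(n: int) -> int:
    """Minimum number of bits needed to address n distinct slots (0 .. n-1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # (n - 1).bit_length() is the bit count of the largest index, n - 1.
    return max(1, (n - 1).bit_length())


def fits_in_word(n: int, w: int = 64) -> bool:
    """True if an index into a table of n slots fits in one w-bit word."""
    return index_bits(n) <= w


if __name__ == "__main__":
    for n in (1_000, 10**9, 2**32, 2**64, 2**64 + 1):
        print(n, index_bits(n), fits_in_word(n))
```

Under those assumptions, a table would need more than 2^64 slots before the index stopped fitting in a single word, which is the sense in which any reasonable w drops out of the analysis.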
Of course, real-world algorithms often hash strings, which complicates the analysis, but that's orthogonal, like my misstep earlier in talking about the size of an element rather than the size of n. The mathematical sense in which hashing time must increase with the log of n is a fact of all algorithms that have an n.
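To make the string case concrete, here is a minimal sketch using 64-bit FNV-1a, picked only because it is short; it is not a claim about what any particular hashtable uses. The step counter is an illustrative addition, not part of the hash:

```python
FNV_OFFSET = 0xCBF29CE484222325  # FNV-1a 64-bit offset basis
FNV_PRIME = 0x100000001B3        # FNV-1a 64-bit prime
MASK64 = (1 << 64) - 1


def fnv1a_64(data: bytes):
    """Return (hash, steps): the 64-bit FNV-1a hash and the loop iterations used."""
    h = FNV_OFFSET
    steps = 0
    for byte in data:
        h ^= byte                      # mix in one byte of the key
        h = (h * FNV_PRIME) & MASK64   # multiply, truncated to one 64-bit word
        steps += 1
    return h, steps


if __name__ == "__main__":
    for key in (b"a", b"hashtable", b"x" * 1000):
        h, steps = fnv1a_64(key)
        print(len(key), steps, hex(h))
```

Each iteration is constant-time word arithmetic, so the cost grows with the length of the key, independent of how many entries the table holds.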
I think your surprise at this just stems from wanting a rigorous definition for something that's more of a modeling tool, with different models to choose from. In real life, there's the time on the wall since you clicked "Run", and everything else is theorygramming.