| I was about to say "the example in the post is sorting so it's n * log(n)". Upon thinking about it, the example is comparing the relative speed of hashing every item (O(n) * O(1)) == O(n) and sorting (O(n * log(n)). That means the ratio of these two is a log n factor n * log(n) / n == log(n). good point. As a side point, I think the main point of the post is that the time taken for the hash solution, increases at a greater rate than the sort + unique solution. Now, that doesn't make sense with our big 0 notation. This is because we don't have a simple hashmap, what we have is a dynamically resizing hashmap, which we don't set a capacity on, and the resize operations are order n. Now, how often can we expect to resize? I think most hashmaps double their capacity, so there will be log(n) resize operations that each take O(m) (where m is the size of the hashmap when it needs to be resized). Now at this point, I can't be bothered to think further about it. That feels like it should be less than O(n * log(n)), but it's kinda close. Either way, it's definitely larger than the simpler O(n) case we're thinking about. |
If there are deletions, both operations are amortised O(1) provided they have been sensible and made the deletion threshold a lower factor.
It's the same sort of pattern as dynamically resizing a growing string. As long as you resize by a constant > 1 factor, rather than an addition, the number of resizes is low enough that the O(n) cost of copying is amortised over prior appends.
> Either way, it's definitely larger than the simpler O(n) case we're thinking about.
It actually isn't larger than O(n), setting aside constant factors.
For a hash map's time to grow more than O(n) for n insertions starting from empty, suggests an implementation problem.