Hacker News new | ask | show | jobs
by hermitdev 2619 days ago
They lost me when they got into showing how to make the 'dups' function faster. The author definitely either doesn't understand big-O notation or doesn't understand the complexity. Their O(1) implementation is anything but. Likely O(n×log(n)) at best. Also, their brute force implementation is unnecessarily verbose. Want dups?

  from collections import defaultdict
  def dups(seq):
    d = defaultdict(int)
    for x in seq:
      d[x] += 1
    return [k for k, v in d.items() if v > 1]
Assuming Python's defaultdict has O(1) lookup/insertion (which I think it does), this algorithm is a proper O(n) complexity.
2 comments

Hmm. I don't think they claimed to have an O(1) time solution, just O(1) added space. Which, it is, but only because they're counting on the original array's underlying type having enough bits for their sign flipping. It would be as if you used a more compact type for the array elements, and then allocated another bitmap for the range of numbers.

Of course, once we start optimizing how the original array is stored, we may have exceeded the limits of this problem as a teaching exercise :)

As for time, it does seem to be O(n) to me; can you clarify why you think it's nlogn? It may not be particularly fast in practice when compared to other O(n) approaches like the bitmap, but I don't think the complexity is wrong.

Your solution is nice - it actually gives you more information (how many appearances, not just T/F >1 appearance), but it does require more additional space and isn't necessarily faster. I think the bitmap approach would be nicer if you're ok with using more space; the bitmap is essentially a very easy to find perfect hash function due to the unique input constraints.

They didn't say it uses O(1) complexity:

> Analyzing the above approach, we have an algorithm that takes O(n) time and uses O(1) space.