| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by zelphirkalt 699 days ago

You can simply pass once over the data, and while you do that, count occurrences of the elements, memorizing the last maximum. Whenever an element is counted, you check, if that count is now higher than the previous maximum. If it is, you memorize the element and its count as the maximum, of course. Very simple approach and linear in time, with minimal book keeping on the way (only the median element and the count (previous max)).

I don't find it surprising or special at all, that finding the median works in linear time, since even this ad-hoc thought of way is in linear time.

EDIT: Ah right, I mixed up mode and median. My bad.

1 comments

gcr 699 days ago

This finds the mode (most common element), not the median.

Wouldn't you also need to keep track of all element counts with your approach? You can't keep the count of only the second-most-common element because you don't know what that is yet.

link

zelphirkalt 699 days ago

Yes, you are right. I mixed up mode and median.

And yes, one would need to keep track of at least a key for each element (not a huge element, if they are somehow huge). But that would be about space complexity.

link

gcr 699 days ago

pardon! it's fun to think about though!

link