by toth 902 days ago
In general, median and mode are much harder than min/avg/max. You can't compute the former with constant memory in one pass (you can do approximate median, but not exact median).

(Here the temperature has a restricted range with only 1999 possible values (-99.9 to 99.9 in 0.1 increments), so you could do it in constant memory; you would need something like 4*1999 bytes per unique place name.)
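To make the counting idea concrete, here is a minimal sketch, assuming readings arrive as (place, text) pairs with one decimal digit. The range -99.9 to 99.9 in steps of 0.1 gives 1999 distinct values, so one counter per value is enough to recover the exact median. The names `to_tenths`, `median_from_counts` and `exact_medians` are made up for illustration, and a plain Python list uses far more than 4 bytes per counter; a packed array of 32-bit counters would be closer to the estimate above.

```python
from collections import defaultdict

OFFSET = 999       # shifts -999..999 tenths into indices 0..1998
N_BUCKETS = 1999   # one counter per possible reading


def to_tenths(text):
    """Parse a reading like '-12.3' into integer tenths (-123)."""
    sign = -1 if text.startswith("-") else 1
    whole, _, frac = text.lstrip("+-").partition(".")
    # Assumes at most one decimal digit, as in the dataset described above.
    return sign * (int(whole or "0") * 10 + int((frac + "0")[:1]))


def median_from_counts(hist):
    """Walk the counters in value order until the middle ranks are reached."""
    total = sum(hist)
    lo_rank, hi_rank = (total - 1) // 2, total // 2  # 0-based middle ranks
    lo = hi = None
    seen = 0
    for bucket, count in enumerate(hist):
        seen += count
        if lo is None and seen > lo_rank:
            lo = bucket - OFFSET
        if seen > hi_rank:
            hi = bucket - OFFSET
            break
    # Average the two middle values (identical when total is odd),
    # then convert tenths back to degrees.
    return (lo + hi) / 20


def exact_medians(rows):
    """One pass over the rows, fixed memory per place name."""
    counts = defaultdict(lambda: [0] * N_BUCKETS)
    for place, reading in rows:
        counts[place][to_tenths(reading) + OFFSET] += 1
    return {place: median_from_counts(hist) for place, hist in counts.items()}
```

Memory here grows with the number of distinct place names, not with the number of rows, which is the sense in which it is "constant" for this dataset.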

For the sum, overflow is not an issue if you use 64-bit integers. Parse everything to integers in tenths of a degree, and even if all 1 billion rows are 99.9 for the same place name (worst possible case), you are very far from overflowing.
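A quick sanity check of that worst case, as a sketch (Python integers never overflow, so this only does the arithmetic against the signed 64-bit limit):

```python
ROWS = 1_000_000_000      # 1 billion rows, all for the same place name
MAX_TENTHS = 999          # 99.9 degrees stored as integer tenths

worst_sum = ROWS * MAX_TENTHS   # largest possible running sum, about 1e12
INT64_MAX = 2**63 - 1           # largest signed 64-bit value, about 9.2e18

# How many times the worst-case sum fits into a signed 64-bit integer.
headroom = INT64_MAX // worst_sum
```

The headroom comes out at roughly nine million, and the all-negative case (-99.9 everywhere) is symmetric.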


Putting this on top comment because I can't edit:

I am silly and wrote 'mode' and shouldn't have :P (wetware error: saw list of 3 items corresponding to leetcode and temperature dataset, my 3 were min/max/average, their 3 are mean/median/mode)