|
|
|
|
|
by tadkar
2036 days ago
|
|
Related to this, one of my favourite articles [1] suggests that it’s sufficient to only use one or two pieces of memory to get good estimates. Here’s a pretty amazing result from the paper. To estimate the median (loosely) on a stream, first set the median estimate m to 0 prior to seeing any data. Then as you observe the stream, increase the median estimate m by 1 if the current element s is bigger than m. Do nothing if the current element is the same as m. Decrease the estimate by 1 if the element is less than m. Then (given the right conditions) this process converges to the median. You can even extend this to quantiles by flipping a biased coin and then updating based on the result of the flip as well as the element comparison. The unfortunately now deceased neustar research blog had an interactive widget as well an outstanding write up[2] [1]https://arxiv.org/abs/1407.1121 [2] http://content.research.neustar.biz/blog/frugal.html |
|