by hinkley 410 days ago
This reminds me that I need to spend more time thinking about the algorithm the Allies used to count German tanks by serial number. The people in the field estimated about 5x as many tanks as were actually produced, but the serial number trick was over 90% accurate.

It seems like it could have some utility in places where HyperLogLog isn't quite right. YouTube recommendations pointed me at a Numberphile video on this a couple of weeks ago:

https://youtube.com/watch?v=WLCwMRJBhuI

An interesting corollary of this is that if you only have a single sample, the estimate amounts to treating that sample as the median value, i.e. if you see one item with serial number N, you can guess that there were roughly 2N produced.
You do have outliers though. SEAL Team 6 was reportedly given that number when far fewer SEAL teams existed, so that adversaries would overestimate how many there were.
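
For anyone who wants to play with it: the version of the serial number trick usually quoted is the frequentist estimator m + m/k - 1, where m is the largest serial seen and k is how many were seen. Below is a minimal sketch of that formula, not necessarily what the wartime analysts actually ran. It assumes serials run 1..N with no gaps and that the observed items are a uniform random sample without replacement; `estimate_total` is just a name made up for this example.

```python
def estimate_total(serials):
    """Estimate how many items exist from the serial numbers observed.

    Assumption: serials are numbered 1..N with no gaps, and the observed
    ones are a uniform random sample without replacement.
    """
    k = len(serials)
    if k == 0:
        raise ValueError("need at least one observed serial number")
    m = max(serials)
    # Largest serial seen, plus the average gap between observed serials.
    return m + m / k - 1


print(estimate_total([19, 40, 42, 60]))  # 60 + 60/4 - 1 = 74.0
print(estimate_total([100]))             # one sample: 2*100 - 1 = 199.0
```

The single-sample call is the "roughly 2N" case from the comment above: with k = 1 the formula collapses to 2m - 1.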