by floren 2245 days ago
Could you go into the selection criteria for those 3 numbers? I see 2 and 769 on the original MIDAS repo as default number of hash functions and buckets, but without explanation of why they're chosen. The third argument, "m", I don't see explained anywhere.
2 comments

I adapted the code from the original paper's repository: https://github.com/bhatiasiddharth/MIDAS/

According to one of the issues (https://github.com/bhatiasiddharth/MIDAS/issues/7#issuecomme...), M can be any reasonable value, because it is only used to compute the edge hash (https://github.com/steve0hh/midas/blob/master/edgehash.go#L4...).
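To make the role of M concrete, here is a minimal sketch of a count-min-style edge hash. It assumes the source and destination IDs are folded into one key as `src + m * dst` and then passed through a per-row linear hash modulo the number of buckets. The class name, method names, and the exact formula are illustrative assumptions, not copied from either linked repo, so check `edgehash.go` or the original C++ for the real thing.

```python
import random


class EdgeHash:
    """Count-min-style sketch keyed on (src, dst) edges. Illustrative only."""

    def __init__(self, rows, buckets, m, seed=0):
        rng = random.Random(seed)
        self.rows = rows
        self.buckets = buckets
        self.m = m
        # One (a, b) pair per row for a linear hash (a * key + b) mod buckets.
        self.hash_a = [rng.randrange(1, buckets) for _ in range(rows)]
        self.hash_b = [rng.randrange(0, buckets) for _ in range(rows)]
        self.count = [[0.0] * buckets for _ in range(rows)]

    def bucket(self, src, dst, row):
        # Assumption: m only serves to fold the two node IDs into one key.
        key = src + self.m * dst
        return (key * self.hash_a[row] + self.hash_b[row]) % self.buckets

    def insert(self, src, dst, weight=1.0):
        for row in range(self.rows):
            self.count[row][self.bucket(src, dst, row)] += weight

    def get_count(self, src, dst):
        # Taking the minimum across rows limits overestimation from collisions.
        return min(
            self.count[row][self.bucket(src, dst, row)]
            for row in range(self.rows)
        )
```

In this sketch, if m is larger than every node ID, then `src + m * dst` is distinct for each (src, dst) pair before hashing. A smaller m just adds a few more collisions on top of the ones the buckets already cause, which fits with "any reasonable value".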

Hope that clarifies things.

Hi, I'm the author of the MIDAS algorithm. We choose the number of hash functions and buckets according to the maximum error we can tolerate and the theoretical guarantee on the false positive probability that we want. Please refer to the AAAI paper here: https://www.comp.nus.edu.sg/~sbhatia/assets/pdf/midas.pdf. Let me know if you need more details.
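As a rough illustration of that trade-off, the sketch below uses the textbook count-min sketch sizing: with w = ceil(e / epsilon) buckets and d = ceil(ln(1 / delta)) hash functions, an estimate exceeds the true count by more than epsilon times the total count with probability at most delta. This is the standard count-min bound, assumed here as a stand-in; the paper's own theorem is the reference for the exact false positive guarantee. The function names are made up for this example.

```python
import math


def cms_dimensions(epsilon, delta):
    """Return (rows, buckets) for a target error epsilon and failure prob delta."""
    buckets = math.ceil(math.e / epsilon)
    rows = math.ceil(math.log(1.0 / delta))
    return rows, buckets


def cms_guarantee(rows, buckets):
    """Invert the sizing: return the (epsilon, delta) implied by given dimensions."""
    return math.e / buckets, math.exp(-rows)


if __name__ == "__main__":
    # The defaults mentioned upthread: 2 hash functions, 769 buckets.
    eps, delta = cms_guarantee(2, 769)
    print(f"epsilon ~ {eps:.5f}, delta ~ {delta:.3f}")
```

Under this standard bound, 2 hash functions and 769 buckets correspond to epsilon of about 0.0035 and delta of about 0.135. Adding rows shrinks delta exponentially, while adding buckets shrinks epsilon linearly.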