by stochastic_monk 2776 days ago
Thank you! I’ll probably explore hmh further. I have a draft implementation which I’ll compare to related structures soon. A key advantage of hyperminhash is the ability to sample, provided one uses a large enough sampled point and a reversible hash, which could be helpful for certain applications. (hll, by contrast, discards all input items.)
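
To illustrate what "reversible hash" means here, a minimal sketch of an invertible 64-bit mixer. This is not taken from any hyperminhash implementation; the names `mix` and `unmix` are made up for the example, and the multiplier is just an odd constant borrowed from the MurmurHash3 finalizer (any odd 64-bit constant would be invertible mod 2**64). Requires Python 3.8+ for the modular inverse via `pow`.

```python
MASK64 = (1 << 64) - 1
MULT = 0xFF51AFD7ED558CCD           # odd, so it has an inverse mod 2**64
MULT_INV = pow(MULT, -1, 1 << 64)   # modular inverse (Python 3.8+)

def mix(x: int) -> int:
    """Bijective mixing of a 64-bit integer (hypothetical helper)."""
    x &= MASK64
    x ^= x >> 33                    # xorshift by >= 32 bits is its own inverse
    x = (x * MULT) & MASK64
    x ^= x >> 33
    return x

def unmix(h: int) -> int:
    """Undo mix(): apply the inverse steps in reverse order."""
    h &= MASK64
    h ^= h >> 33
    h = (h * MULT_INV) & MASK64
    h ^= h >> 33
    return h
```

If a sketch keeps enough bits of `mix(x)`, the original `x` can be recovered with `unmix`, which is the property that makes sampling possible at all.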

In fact, I spend 8 bits per entry to facilitate SIMD acceleration, but I’m happy with the speed/memory tradeoff.
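
A rough sketch of why byte-wide entries are convenient, assuming the entries are HyperLogLog-style registers (my actual code differs, and `P`, `hash64`, `add` and `merge` are names invented for this example). With 64-bit hashes and `P` index bits the stored rank never exceeds 64 - P + 1, so 6 bits would suffice; rounding up to 8 makes the union an elementwise max over bytes, which is the kind of loop that maps onto byte-wise SIMD max instructions.

```python
import hashlib

P = 10                 # number of index bits (assumed value)
M = 1 << P             # number of one-byte registers

def hash64(item: bytes) -> int:
    # Stand-in 64-bit hash built from the standard library.
    return int.from_bytes(hashlib.blake2b(item, digest_size=8).digest(), "big")

def add(registers: bytearray, item: bytes) -> None:
    h = hash64(item)
    idx = h >> (64 - P)                      # top P bits pick the register
    rest = h & ((1 << (64 - P)) - 1)         # remaining 64 - P bits
    # 1-based position of the leftmost set bit in `rest`
    rank = (64 - P) - rest.bit_length() + 1
    if rank > registers[idx]:
        registers[idx] = rank

def merge(a: bytearray, b: bytearray) -> bytearray:
    # Union of two sketches: elementwise max, one byte per register.
    return bytearray(max(x, y) for x, y in zip(a, b))
```

The merged sketch is identical to the one built directly from the union of the two input sets, since each register only ever keeps a maximum.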

A lot of issues are eliminated by using Ertl’s MLE or JMLE formulas, but I do see how sets of combinatorial comparisons could be helped with hmh.