Hacker News new | ask | show | jobs
by hansvm 1976 days ago
Not only is there a rough order, they explicitly require the ability to meaningfully embed data into the reals, and the performance gains come from assuming those embeddings have simple delta distributions. I wouldn't be surprised if the technique is worse than useless when that assumption is violated.

Edit: I don't have time right now, but a toy example I like to throw at these kinds of problems is mapping primes to their indices (e.g. 2->0, 3->1, 5->2, ...). General-purpose learning algorithms can usually make a little headway with it, but not much, and only with substantial resources thrown at the problem. I'd be shocked if that toy example were any faster with their solution than a b-tree.

1 comments

Due to the prime number theorem, your toy example actually has a very good approximated mapping.
Kind of. Appropriately normalizing the primes with some kind of function based on log(log(n)) would probably allow the article's technique to perform well (still struggling on larger primes -- you can't avoid needing a lot of bits to express those gaps), but it's precisely the shifting density which makes the problem hard to learn with any standard algorithm, and I don't think theirs would be an exception since they explicitly rely on gaps drawn from some fixed distribution.