Thanks for the link. Very interesting. I found this [1] from 2015.

Reading your citation, the practical issue seems to me to be that the optimizer's own memory footprint may in fact negate any benefit (e.g. ~40% over LRU) gained from reducing cache misses.

My gut feeling is that this approach (for online systems) may work best with a hardware component: a card hosting the 'experts' and their virtual model, e.g. the "virtual cache". The distributed variant also seems worth exploring.

[1]: https://arxiv.org/pdf/1403.0388.pdf