|
|
|
|
|
by jandrewrogers
36 days ago
|
|
To give a concrete example, spatial indexing algorithms badly break cache replacement algorithms for intrinsic theory reasons. This has been known since at least the 1980s. We have a literature full of spatial indexing algorithms that can't work in any real system because they assume cache replacement algorithms. This problem isn't even mentioned in modern academic papers. That is extremely low-value research. That's like doing physics research under the assumption that the fundamental laws of physics don't apply. It might be an intellectually interesting exercise but it isn't useful. It isn't all like this. The spatial indexing literature is actively bad to an unusual extent. If you look at e.g. academic graph analytic algorithm research, where I also worked, it is mostly just decades behind the non-academic state-of-the-art. The literature won't mislead you but it also won't tell you where the frontier is. |
|
This article[0] says "Data changes are usually much less frequent than queries, so incurring an initial cost of processing data into an index is a fair price to pay for instant searches afterwards."
Is that what you're referring to with cache replacement? There's no way to quickly update the index?
Wikipedia says "The R-tree was proposed by Antonin Guttman in 1984 and has found significant use in both theoretical and applied contexts."
I'm gathering that there are applications which are ok with that limitation? Generally there is communication and awareness between industry and academia, if everyone really needs cache replacement for it to be useful, why have they not attempted to account for it? What is the cause of the total disconnect you describe?
[0] https://blog.mapbox.com/a-dive-into-spatial-search-algorithm...