| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by petschge 2605 days ago
	I do plasma simulations and recently had the problem of finding the distance to the nearest neighbor for every of the particles in the simulation. Doing that naively is O(n^2) and took hours even for small test problems. Building an R-tree once and using if for nearest-neighbor look-ups brought that down to 5 minutes. libspatialindex lacks documentation, but worked really nicely. The rtree interface in python is much friendlier.

4 comments

arman_ashrafian 2604 days ago

I’m taking Advanced Data Structures at UCSD right now and our first assignment was making a K-D Tree and an efficient KNN Classifier. It was surprisingly simple and the efficiency between the KD Tree and brute force implementation was quite drastic.

If you only build the tree once and do no insertions what is the benefit of an R-Tree vs KDTree?

link

petschge 2604 days ago

I actually do plan to update the tree as I insert additional particles in locations where the distance to the nearest neighbor is large.

link

BubRoss 2604 days ago

There is a library called flann that does approximate nearest neighbors using a kd-tree and best bin first method. It is typically used not only for nearest neighbors but finding them in higher dimensions.

link

igotsideas 2604 days ago

Plasma simulation sounds really damn cool.

link

wolfram74 2604 days ago

It's actually usually pretty hot. </pun>

link

amelius 2604 days ago

I'm curious, how many particles do your simulations typically contain?

link

petschge 2604 days ago

Typical full sized simulation runs contain between 100 million (1e8) and 100 billion (1e11) particles. Small tests while debugging might be as small as a few thousand particles (1e4). And tests of the code might only use one or two particles.

The thing I was working on when I switched from naive O(n^2) to R-trees had half a million (5e5) particles.

link