| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by yjftsjthsd-h 1181 days ago
	> The library was already using numpy for a lot of its calculations, so why should we expect Rust to be better? I literally clicked in to read the article to see if they'd mention this:) But... unless I missed it, there wasn't really an answer? I thought numpy does do the heavy lifting in native code, so why is this faster? Does this version just push more of the logic into native code than numpy did?

5 comments

alex_smart 1181 days ago

Numpy is fast when the code is vectorized. The code they are benchmarking against was not vectorized. They wanted to calculated the distances of n points against a given point and find out which points are closer than a threshold (max_dist). Instead of vectorizing the whole operation, the python code was just calling numpy in a loop to find the distance of two points.

Just that small change already gives 10x faster performance without ever leaving python/numpy land.

link

akasakahakada 1181 days ago

> They wanted to calculated the distances of n points against a given point and find out which points are closer than a threshold (max_dist).

Scipy should have already implemented such thing. Scikit-Learn also. Because KNN clustering is exactly doing this kind of work.

link

fbdab103 1181 days ago

They did have a choice quote a bit before the blurb you quoted. The real world problem is sufficiently more complex/different enough that vectorizing it would be a pain.

>It’s worth noting that converting parts of / everything to vectorized numpy might be possible for this toy library, but will be nearly impossible for the real library while making the code much less readable and modifiable...

link

yjftsjthsd-h 1181 days ago

Thanks, I managed to miss that

link

anewhnaccount2 1181 days ago

The article mentioned that there are some gains from using vectorised numpy methods, i.e. spending more time in numpy code. I would be interested in whether the List[Polygon] could be converted into two long arrays of all xs and all ys with indices into the starts (essentially a dense representation of a sparse array) and then the core function rewritten for Numba since it could now not use any Python objects. This would break the interface of course, but may get within striking distance of the Rust implementation.

link

biomcgary 1181 days ago

The slowness comes from the interaction of numpy and a Python object "Polygon", which in not numpy. I suspect that a sufficiently clever coder could have optimized the result without resorting to Rust, but at the cost of a substantial increase in complexity of the codebase. The proposed approach keep the Python code simple (and moves the complexity into having another language to deal with).

link

alex_smart 1181 days ago

    diff --git a/poly_match_v1.py b/poly_match_v1.py
    index 675c88a..4293a46 100644
    --- a/poly_match_v1.py
    +++ b/poly_match_v1.py
    @@ -1,4 +1,5 @@
     from functools import cached_property
    +from itertools import compress
     from typing import List, Tuple
     import numpy as np
     from dataclasses import dataclass
    @@ -56,11 +57,8 @@ def generate_example() -> Tuple[List[Polygon], List[np.array]]:
     def find_close_polygons(
  polygon_subset: List[Polygon], point: np.array, max_dist: float
     ) -> List[Polygon]:
    -    close_polygons = []
    -    for poly in polygon_subset:
    -        if np.linalg.norm(poly.center - point) < max_dist:
    -            close_polygons.append(poly)
    -
    +    dist = np.linalg.norm([poly.center for poly in polygon_subset] - point, axis=1)
    +    close_polygons = list(compress(polygon_subset, dist < max_dist))
  return close_polygons

10x faster than the original, without resorting to native code, and without substantial increase in complexity of code base.

link

mgraupner 1181 days ago

I wish the author would take your suggestion (and other recommendations) here and try it out again. Would be a very interesting follow up to read: "How we speed up Python and simplifying our codebase by removing on more dependency".

link

nossid 1181 days ago

Focusing on speeding up find_close_polygons instead of realizing that you're matching many points against the same set of polygons is also unfortunate, since that function being slow is a red herring. You can create a scipy.spatial.KDTree for example and just query all the points against that.

link

lmeyerov 1181 days ago

People can write slow code in any language ;-) We've had to fire contractors who "wrote" dataframe code but was not vectorized in practice despite repeat request to do so. Same thing for slow code in CUDA.

From a maintenance view, I much prefer folks write vectorized data frames vs numpy or low level bindings, but that comes from having lived with the alternative for a lot longer. All of our exceptions are pretty much slotted for deletion. (Our fast path is single or multi-GPU dataframes in python.) Here's to hoping that one day we'll have dependent types in mypy!

link

yjftsjthsd-h 1181 days ago

> People can write slow code in any language ;-)

That was a lot of my curiosity, because I'm not quite a good enough programmer to know whether the problem is just that the original code is bad or does something super inefficient, and so rewriting it any language will let you make massive improvements.

link