by mgradowski 2015 days ago
Regarding [1], many OpenCV functions in Python support an optional dst argument, similar to how they do in the C++ API. This makes the memory management situation not completely hopeless.

In my experience, the only drawback of the Python API is that it cannot take advantage of the `threading` module (it probably does not release the GIL for long calls).


I believe many (most?) opencv & numpy operations release the GIL.

Good point about OpenCV's in-place operations. That is sometimes tricky or impossible in numpy when you need to implement something that OpenCV doesn't provide. For example, we wrote the C code that I linked in my previous comment as an optimization of OpenCV's `matchTemplate` for one specific case: both input images are the same size. In C we do the multiplications as we iterate over the images, maintaining a rolling sum in a single variable. In numpy you can't really do this; you have to multiply the whole array and then sum the result.
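
To make the difference between the two strategies concrete, here is a minimal pure-Python sketch. It is not the linked C code: the sum of squared differences is assumed as the quantity being computed, and the function names and pixel values are made up for illustration. The first function mirrors the whole-array style (build a temporary as large as the image, then sum it); the second mirrors the C style (one pass, one accumulator).

```python
def ssd_two_pass(a, b):
    """Build the full list of squared differences, then sum it."""
    # The temporary list is as large as the image itself.
    squared = [(x - y) * (x - y) for x, y in zip(a, b)]
    return sum(squared)


def ssd_rolling(a, b):
    """Accumulate in a single variable while iterating; no temporary buffer."""
    total = 0
    for x, y in zip(a, b):
        d = x - y
        total += d * d
    return total


# Two tiny same-size "images" flattened to pixel sequences (made-up values).
img_a = bytes([10, 20, 30, 40])
img_b = bytes([12, 18, 30, 44])

print(ssd_two_pass(img_a, img_b), ssd_rolling(img_a, img_b))  # prints "24 24"
```

In pure Python the single-pass loop is not faster, since every iteration goes through the interpreter. The point is only the shape of the computation: the rolling version never writes an intermediate image to memory, and that pays off once the loop is compiled.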

For 720p images our C implementation[1] was 100x faster than our numpy implementation[2], and 10x faster than numba[3].

[1]: https://github.com/stb-tester/stb-tester/blob/v32/_stbt/sqdi...

[2]: https://github.com/stb-tester/stb-tester/pull/566/files#diff...

[3]: https://github.com/stb-tester/stb-tester/pull/566/files#diff...

That's a nice and tidy codebase, real pleasure to read.

> I believe many (most?) opencv & numpy operations release the GIL.

Any idea how I can determine this? I am prototyping a real-time machine vision application targeting 2x720p@240fps, and I want to avoid writing any C++ for as long as possible.

I don't know much about Python's C API, but this is the line in the OpenCV Python bindings that drops the GIL: https://github.com/opencv/opencv/blob/4.5.0/modules/python/s...

(see https://docs.python.org/3/c-api/init.html#releasing-the-gil-... in the Python C API manual).

That's only called from the ERRWRAP2 macro here: https://github.com/opencv/opencv/blob/4.5.0/modules/python/s...

That macro, in turn, is called from gen_template_func_body in gen2.py: https://github.com/opencv/opencv/blob/4.5.0/modules/python/s...

And that seems to be a code generator that is generating the bindings for all the OpenCV functions: https://github.com/opencv/opencv/blob/4.5.0/modules/python/s...

As to testing this, on Linux I'd run `atop` to see if your program is using all available CPUs.
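
A rough way to check a specific call from Python itself is to time it serially and then from several threads. This is a sketch using only the standard library; `parallel_speedup` is a hypothetical helper name, not part of OpenCV or numpy. If the call releases the GIL, the threaded run should finish in noticeably less time than the serial one.

```python
import threading
import time


def parallel_speedup(work, n_threads=4):
    """Return serial_time / threaded_time for running work() n_threads times."""
    # Serial baseline: call work() n_threads times, one after another.
    start = time.perf_counter()
    for _ in range(n_threads):
        work()
    serial = time.perf_counter() - start

    # Threaded run: the same number of calls, one per thread.
    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    threaded = time.perf_counter() - start

    return serial / threaded


# time.sleep releases the GIL while waiting, so the ratio should be
# well above 1 here (close to n_threads).
print(parallel_speedup(lambda: time.sleep(0.1)))
```

To test an OpenCV call, pass something like `lambda: cv2.matchTemplate(image, template, cv2.TM_SQDIFF)` as `work`, with your own `image` and `template` arrays. A ratio near 1 suggests the GIL is held for the duration of the call, but treat the result with care: it also stays near 1 on a single-core machine, or when the call already uses all cores internally.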

Nice spelunking! If there is hope of avoiding C++ altogether, then I'll try testing it again on something beefier than my laptop. Thanks for taking the time and effort.