by mgradowski
2015 days ago
Regarding [1], many OpenCV functions in Python support an optional dst argument, just as they do in the C++ API. This makes the memory management situation not completely hopeless. In my experience, the only drawback of the Python API is that it cannot take advantage of multithreading (it probably does not release the GIL during long calls).

Good point about OpenCV's in-place operations. That's sometimes tricky or impossible in numpy when you need to implement something that OpenCV doesn't provide. For example, we wrote the C code that I linked in my previous comment as an optimization of OpenCV's `matchTemplate` for one specific case: both input images are the same size. In C we do the multiplications as we iterate over the images and keep a running sum in a single variable. In numpy you can't really do this: you have to multiply the whole array and then sum the result.
For 720p images our C implementation[1] was 100x faster than our numpy implementation[2], and 10x faster than numba[3].
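
To make the difference concrete, here is a minimal sketch. It assumes the metric is a sum of squared differences (one of `matchTemplate`'s modes); the function names are made up for illustration and this is not the stb-tester code. The first function is the whole-array style, the second reuses a caller-supplied buffer through numpy's `out=` argument, and the third has the shape of the fused C loop (far too slow in pure Python, shown only to illustrate the structure).

```python
import numpy as np


def sqdiff_whole_array(a, b):
    """Whole-array style: allocates full-size temporaries, then reduces."""
    d = a.astype(np.int64) - b.astype(np.int64)  # three image-sized temporaries
    return int((d * d).sum())                    # plus one more for d * d


def sqdiff_with_out(a, b, buf):
    """Same result, but writes intermediates into a preallocated int64 buffer."""
    np.subtract(a, b, out=buf, dtype=np.int64)   # buf = a - b, computed as int64
    np.multiply(buf, buf, out=buf)               # buf = buf * buf, in place
    return int(buf.sum())                        # still a second pass over buf


def sqdiff_fused(a, b):
    """Shape of the C loop: one pass, one accumulator, no image-sized buffer
    for the products. (The tolist() calls are only there to get plain ints.)"""
    total = 0
    for x, y in zip(a.ravel().tolist(), b.ravel().tolist()):
        d = x - y
        total += d * d
    return total
```

Even with `out=`, numpy still writes every product to memory and then reads it back to sum it; the fused loop never stores the products at all, which is the part numpy cannot express directly.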
[1]: https://github.com/stb-tester/stb-tester/blob/v32/_stbt/sqdi...
[2]: https://github.com/stb-tester/stb-tester/pull/566/files#diff...
[3]: https://github.com/stb-tester/stb-tester/pull/566/files#diff...