by briggers 2481 days ago
I see from the MTCNN code that this repo (like all others I've seen) is still bouncing tensors between GPU and CPU while passing between the P/R/ONets.

So many ML repos make this mistake in pre/post-processing and end up bottlenecked on CPU.

Anyone know of an MTCNN that's been ported to run more or less fully on GPU? (Or even that does batching instead of an image-by-image approach?)

2 comments

I have used this before: https://github.com/blaueck/tf-mtcnn which uses TensorFlow for all the xNets.

Example in Rust: https://cetra3.github.io/blog/face-detection-with-tensorflow...

I'm not aware of any implementation with these features, but both are on the roadmap for the linked repo, and both should be achievable. Batch processing in particular will be a straightforward change and should result in quite a speed-up, although it will require the input images in a batch to have the same dimensions.
Good stuff.

In my experience inputs to MTCNN tend to be full frames, so the uniform dimension requirement is usually met.
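
For what it's worth, here is a rough sketch of how the same-dimensions constraint could be handled when inputs are not all the same size: group images by (height, width) first, then cut each group into batches. This is not taken from the linked repo. `group_by_shape`, `make_batches`, and the example shapes are hypothetical names and values made up for illustration, and the code only plans which image indices go together; stacking the pixels into a tensor and running the P/R/ONets is left to whatever framework is in use.

```python
from collections import defaultdict


def group_by_shape(shapes):
    """Map each (height, width) to the indices of the images with that size."""
    groups = defaultdict(list)
    for index, shape in enumerate(shapes):
        groups[tuple(shape)].append(index)
    return dict(groups)


def make_batches(shapes, max_batch_size):
    """Split each same-size group into index batches of at most max_batch_size."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    batches = []
    for indices in group_by_shape(shapes).values():
        for start in range(0, len(indices), max_batch_size):
            batches.append(indices[start:start + max_batch_size])
    return batches


# Hypothetical input: three 1080p video frames plus one smaller still image.
frame_shapes = [(1080, 1920), (1080, 1920), (720, 1280), (1080, 1920)]
batches = make_batches(frame_shapes, max_batch_size=2)
```

With full frames from a single video source, everything lands in one group and the grouping step costs next to nothing, so it only matters for mixed-size inputs.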