| HN Mirror



	by threatripper 2120 days ago
	How would one accelerate object tracking on a video stream where each frame depends on the result of the previous one? Batching and multi-threading don't work here. Are there CNN libraries with much less overhead for small batch sizes? TensorFlow (GPU-accelerated) seems to drop from 10000 fps on large batches to 200 fps on single frames for a small CNN.

2 comments

lostdog 2120 days ago

It depends on the algorithm you're using, but here are some places to start:

1. How many times is the data being copied, or moved between devices?

2. Are you recomputing data from previous frames that you could just be saving? For example, some tracking algorithms apply the same CNN tower to the last 3-5 images, and you could save the results from the previous frames instead of recomputing them. (Of course, you also want to follow hint #1 and keep these results on the GPU.)

3. Change the algorithm or network you're using.
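
To make hint #2 concrete, here is a minimal sketch of caching per-frame features over a sliding window, so each frame passes through the expensive extractor only once. The names `FeatureCache` and `fake_tower` are hypothetical, and the "tower" is a stand-in function rather than a real CNN; in a real pipeline the cached values would be GPU-resident tensors.

```python
from collections import deque


class FeatureCache:
    """Keeps features for the last `window` frames so each frame is processed once."""

    def __init__(self, extract, window=3):
        self.extract = extract                  # hypothetical per-frame feature function
        self.features = deque(maxlen=window)    # oldest entry is dropped automatically

    def push(self, frame):
        # Only the new frame goes through the extractor; older results are reused.
        self.features.append(self.extract(frame))
        return list(self.features)


calls = []


def fake_tower(frame):
    # Stand-in for a CNN tower; records each call so recomputation is visible.
    calls.append(frame)
    return frame * 2


cache = FeatureCache(fake_tower, window=3)
for frame in range(5):
    window_feats = cache.push(frame)

print(window_feats, len(calls))
```

With a window of 3 and 5 frames, the extractor runs 5 times rather than the 13 times a naive "recompute the whole window" loop would need.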

Really you should read the original article carefully. The article is showing you the steps for profiling what part of the runtime is slow. Typically, once you profile a little you'll be surprised to find that time is being wasted somewhere unexpected.
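
As a rough illustration of that kind of profiling (not the article's own method), the sketch below accumulates wall-clock time per pipeline stage. The stage names and the `process_frame` body are made-up placeholders. One caveat: GPU work is usually launched asynchronously, so with a real framework you would need to synchronize the device before reading the timer, or use the framework's own profiler.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

stage_seconds = defaultdict(float)  # total wall-clock time per stage


@contextmanager
def timed(stage):
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_seconds[stage] += time.perf_counter() - start


def process_frame(frame):
    # Each body below is a placeholder for the real work in that stage.
    with timed("copy_to_device"):
        data = list(frame)
    with timed("inference"):
        result = sum(x * x for x in data)
    with timed("postprocess"):
        return result % 7


for _ in range(100):
    process_frame(range(1000))

# Print the slowest stages first.
for stage, seconds in sorted(stage_seconds.items(), key=lambda kv: -kv[1]):
    print(f"{stage:>15}: {seconds * 1e3:.2f} ms")
```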

briggers 2120 days ago

Great point - dependencies between frames are inherently problematic for many of these techniques.

Everything lostdog says. I've had experience speeding up tracking immensely using the same big hammer I talk about in the article: moving the larger parts of the tracking compute to the GPU.

Also, in a tracking pipeline you'll generally have the big compute on pixels done up front. Object detection and ReID take the bulk of the compute and can be easily batched and run in parallel. The results (metadata) can then be fed into a more serial process (but still doing the N<->N ReID comparisons on GPU).
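
A minimal CPU-only sketch of that serial association step, assuming ReID embeddings have already been computed in a batched stage. The function names, the greedy matching rule, and the 0.5 threshold are illustrative assumptions; on a GPU the similarity matrix would typically be a single matrix multiply over normalized embeddings.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(tracks, detections):
    # The N<->N comparison: every existing track against every new detection.
    return [[cosine(t, d) for d in detections] for t in tracks]


def greedy_match(sim, threshold=0.5):
    # Take the most similar pairs first; each track and detection is used once.
    pairs = sorted(
        ((s, i, j) for i, row in enumerate(sim) for j, s in enumerate(row)),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), {}
    for s, i, j in pairs:
        if s < threshold:
            break
        if i in used_tracks or j in used_dets:
            continue
        matches[i] = j
        used_tracks.add(i)
        used_dets.add(j)
    return matches


# Toy embeddings: track 0 looks like detection 1, track 1 like detection 0.
tracks = [[1.0, 0.0], [0.0, 1.0]]
detections = [[0.1, 0.9], [0.9, 0.1]]
matches = greedy_match(similarity_matrix(tracks, detections))
print(matches)
```

Only this matching loop has to run frame by frame; the embeddings it consumes can come from a batched, parallel stage.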