by PeachPlum
3108 days ago
I understand and agree with your nuance. If I had a store, though, I would count someone walking past one way and then the other as two potential visits. I would imagine that "people in room at time t" and "people in room at time t+1" is quite a good proxy for "number of people present all day"; it is certainly an upper bound.
|
Once you've done all that, a top-end GPU can handle about 4 FPS (assuming 1080p input).
Then the problem is that YOLO will occasionally miss blindingly obvious objects. That, combined with the fact that you've only got 4 FPS, means that detecting the direction of a person is hard - they tend to move across the camera's field of view before you've got enough data to be confident about what just happened. A person walking from left to right looks the same as someone walking off the left of the frame and then someone different walking onto the right of the next frame.
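
To make the ambiguity concrete, here is a minimal sketch of direction inference from per-frame detections. Everything in it is an assumption for illustration: the function name `infer_direction`, the idea of reducing each detection to an x-centroid in pixels, and the `max_jump` and `min_hits` thresholds. It is not how any particular tracker works, just the simplest version of the idea.

```python
from typing import List, Optional


def infer_direction(xs: List[Optional[float]], max_jump: float, min_hits: int = 3) -> str:
    """Guess walking direction from one x-centroid per frame (None = missed detection).

    Hypothetical helper: consecutive detections are linked into one track only
    if they are at most max_jump pixels apart. A missed frame or a large jump
    starts a new track, because we can no longer tell whether it is the same
    person. The longest track decides the direction.
    """
    best: List[float] = []
    track: List[float] = []
    for x in xs:
        if x is None or (track and abs(x - track[-1]) > max_jump):
            # Track is broken: remember it if it is the longest so far.
            if len(track) > len(best):
                best = track
            track = []
        if x is not None:
            track.append(x)
    if len(track) > len(best):
        best = track

    if len(best) < min_hits:
        return "unknown"  # not enough linked detections to be confident
    delta = best[-1] - best[0]
    if delta > 0:
        return "left_to_right"
    if delta < 0:
        return "right_to_left"
    return "unknown"
```

With four clean detections 300 px apart and `max_jump=400`, this reports `left_to_right`. Drop one frame in the middle, or let the person move more than `max_jump` between frames, and it falls back to `unknown`, which is the situation described above.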
At some point it's easier, cheaper and more accurate to install an IR laser beam and count the breaks. You'll save about half a kilowatt too.
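
For comparison, counting beam breaks is about this much code. This is a sketch under assumptions: `count_breaks` is a made-up name, and the input is assumed to be a sequence of already-debounced boolean samples where `True` means the receiver sees the beam.

```python
from typing import Iterable


def count_breaks(samples: Iterable[bool]) -> int:
    """Count transitions from 'beam seen' (True) to 'beam blocked' (False)."""
    breaks = 0
    previous = True  # assume the beam is unbroken before the first sample
    for seen in samples:
        if previous and not seen:
            breaks += 1  # falling edge: something just stepped into the beam
        previous = seen
    return breaks
```

Note that a single beam counts crossings, not direction, so someone walking in and back out counts twice.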
Another interesting point is that YOLO is pre-trained on hundreds of object classes. This feels like a waste. I wanted to retrain it with all but the people class removed from the training set. My learned colleague suggested that was a stupid idea because YOLO learns general info about how to separate objects from backgrounds from all the object classes. Not showing it surfboards makes it worse at detecting people. Crazy.
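
If the goal is just person-only output, one option that avoids retraining is to keep the pre-trained model as it is and throw away the other classes after inference. A minimal sketch, assuming detections come back as `(label, confidence, box)` tuples; that format, the `keep_people` name and the `min_conf` threshold are assumptions for illustration, not any specific YOLO API.

```python
from typing import List, Tuple

# Hypothetical detection format: (class label, confidence, (x, y, width, height))
Detection = Tuple[str, float, Tuple[int, int, int, int]]


def keep_people(detections: List[Detection], min_conf: float = 0.5) -> List[Detection]:
    """Keep only confident 'person' detections; the model itself is untouched."""
    return [d for d in detections if d[0] == "person" and d[1] >= min_conf]
```

The network still sees the surfboards, it just doesn't report them.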