Hacker News
by lightcatcher 4692 days ago
> Tracking blank white objects — be it a piece of paper, or a big blank wall — is one of the hardest computer vision challenges around.

Most of what I know about computer vision comes from deep learning approaches, but tracking a white object doesn't seem like it should be too difficult. Is tracking a large white object actually "one of the hardest computer vision challenges", or is this just a garbage quote?

3 comments

Tracking within a plain white object is very hard. Plain coloured walls are featureless, so there's nothing for most algorithms to latch onto.

However, tracking a white object such as a piece of paper sitting on a contrasting desk is relatively easy, especially if your algorithm is designed to handle such a case. You have the easily detectable corners and edges of the paper, and from those you can infer its transformation. You can also detect soft deformation (such as bending or crumpling the paper) if your system assumes a piece of paper as its model.
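To make "infer its transformation" concrete, here is a minimal sketch, assuming the four paper corners have already been detected and the paper is flat. It solves for the 3x3 homography that maps the paper's own unit-square coordinates onto the observed image corners. The function names (`solve`, `homography`, `project`) are made up for this illustration, and a real system would use more points and a robust estimator.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography(src, dst):
    """Return 3x3 H (with H[2][2] fixed to 1) mapping four src points to four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h1 x + h2 y + h3) / (h7 x + h8 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(h, point):
    """Apply homography h to a 2D point."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Paper corners in the paper's own (normalised) coordinates.
paper = [(0, 0), (1, 0), (1, 1), (0, 1)]
# Hypothetical detected corner positions in the camera image, in pixels.
seen = [(100, 50), (300, 80), (280, 400), (90, 350)]
H = homography(paper, seen)
centre_in_image = project(H, (0.5, 0.5))  # where the middle of the sheet appears
```

Once you have `H` you can place anything drawn in paper coordinates into the image. Crumpling or bending breaks the flat-plane assumption, which is why a deformable paper model is needed for that case.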

The way some tracking works is to use a corner detector to find "interesting" features. A naive tracking algorithm will then examine the spatial neighbourhood of each feature in the next frame in order to find out where it has moved to.
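A rough sketch of that naive neighbourhood search, assuming grayscale frames stored as lists of rows and a feature position that has already been picked by some corner detector. The names `patch`, `ssd` and `track_feature` and the default window sizes are my own choices for illustration, not from any particular library.

```python
def patch(img, y, x, r):
    """Return the (2r+1) x (2r+1) neighbourhood around (y, x) as a flat list."""
    return [img[j][i]
            for j in range(y - r, y + r + 1)
            for i in range(x - r, x + r + 1)]


def ssd(a, b):
    """Sum of squared differences between two equally sized patches."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def track_feature(prev, curr, y, x, r=1, search=2):
    """Find where the patch around (y, x) in `prev` moved to in `curr`.

    Tries every offset within +/- `search` pixels and keeps the best match.
    Assumes (y, x) is at least `r` pixels away from the image border.
    """
    h, w = len(curr), len(curr[0])
    template = patch(prev, y, x, r)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ny, nx = y + dy, x + dx
            if r <= ny < h - r and r <= nx < w - r:
                cost = ssd(template, patch(curr, ny, nx, r))
                if best is None or cost < best[0]:
                    best = (cost, ny, nx)
    return best[1], best[2]


# A bright 2x2 square on a dark background, moved down 1 and right 2.
prev = [[0] * 8 for _ in range(8)]
curr = [[0] * 8 for _ in range(8)]
for yy in (2, 3):
    for xx in (2, 3):
        prev[yy][xx] = 255
        curr[yy + 1][xx + 2] = 255

new_pos = track_feature(prev, curr, 2, 2)  # follow the square's top-left corner
```

On a blank wall every candidate patch has the same cost, so whichever offset gets checked first "wins" and the result is meaningless. That is the featureless-wall problem in a nutshell.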

There are better feature representations (such as SIFT) which define a "feature" in an image in such a way as to be scale and rotation invariant (you can match the feature against scaled and transformed versions of itself). There are also much better ways to track across frames of video data.

Given that Meta has infrared and RGB stereo cameras it has a lot more information to work with. I hope they can make it work well under all situations, but I am skeptical.

Thank you for the explanation. I was mostly thinking about the case of tracking something like a piece of paper in front of a wooden desk.

I can see how tracking the scale and orientation of single-color objects that fill the field of view would be difficult or impossible.

It doesn't seem like these worst-case scenarios would come up much in real-world use. It's fairly rare to encounter situations where one's entire field of view is filled with one (featureless) color. I would imagine that a wide field of view for the cameras would help greatly with this problem.

Without any texture it's practically impossible to locate and track any feature points on the wall. Most vision algorithms use "corners" (feature points) that can be matched/tracked.
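To show why a blank wall yields no feature points, here is a small sketch of a corner score in the style of Shi-Tomasi (the smaller eigenvalue of the local gradient matrix). It is a simplified illustration with made-up function names and a made-up threshold, not the exact detector any particular system uses.

```python
import math


def corner_scores(img, r=1):
    """Smaller eigenvalue of the 2x2 gradient matrix summed over each window."""
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    # Central-difference gradients (border pixels are left at zero).
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    score = [[0.0] * w for _ in range(h)]
    for y in range(r, h - r):
        for x in range(r, w - r):
            sxx = sxy = syy = 0.0
            for j in range(y - r, y + r + 1):
                for i in range(x - r, x + r + 1):
                    sxx += ix[j][i] ** 2
                    sxy += ix[j][i] * iy[j][i]
                    syy += iy[j][i] ** 2
            # Eigenvalues of [[sxx, sxy], [sxy, syy]]; keep the smaller one.
            disc = math.sqrt((sxx - syy) ** 2 + 4.0 * sxy ** 2)
            score[y][x] = (sxx + syy - disc) / 2.0
    return score


def find_corners(img, threshold):
    """Return (y, x) positions whose corner score exceeds the threshold."""
    score = corner_scores(img)
    return [(y, x)
            for y in range(len(img))
            for x in range(len(img[0]))
            if score[y][x] > threshold]


wall = [[200] * 10 for _ in range(10)]   # uniform wall: no gradients at all
paper = [[0] * 10 for _ in range(10)]    # dark desk...
for yy in range(3, 7):
    for xx in range(3, 7):
        paper[yy][xx] = 255              # ...with a white sheet on it

wall_corners = find_corners(wall, 1.0)    # nothing to latch onto
paper_corners = find_corners(paper, 1.0)  # includes the sheet's corners
```

A straight edge alone also scores zero here, because the gradient only varies in one direction. You need intensity changes in two directions before a point can be matched unambiguously.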

A way to get around that is to use an infrared setup like the Kinect to project a pattern onto the object, but I'm pretty sure that wouldn't work if both the projector and the object are moving.

No, it works for moving projectors as long as the camera is moving as well.

http://www.youtube.com/watch?v=CSBDY0RuhS4 http://www.youtube.com/watch?v=Sw4RvwhQ73E

Since you could assume that the projector is standing still and everything else is moving, the same should be true for the inverse and all points in between.

It depends - tracking a white piece of paper on a snowy backdrop is definitely a hard challenge ;)

But all in all, I wouldn't say it is. In undergrad, the final project of my computer vision class was to track a soccer ball over video frames. A white circular object against a mostly green backdrop is fairly straightforward.
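For what it's worth, one simple way to approach that kind of problem (not necessarily what the class expected) is to threshold for near-white pixels and take their centroid in each frame. The sketch below assumes RGB frames as lists of rows of `(r, g, b)` tuples. The thresholds and function names are guesses for illustration.

```python
def is_white(pixel, min_value=200, max_spread=40):
    """True if the pixel is bright and roughly colourless."""
    r, g, b = pixel
    return min(r, g, b) >= min_value and max(r, g, b) - min(r, g, b) <= max_spread


def ball_centroid(frame):
    """Return the (y, x) centroid of near-white pixels, or None if there are none."""
    sum_y = sum_x = count = 0
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            if is_white(pixel):
                sum_y += y
                sum_x += x
                count += 1
    if count == 0:
        return None
    return (sum_y / count, sum_x / count)


GRASS = (30, 140, 40)
BALL = (250, 250, 250)

frame = [[GRASS] * 6 for _ in range(6)]
for yy in (2, 3):
    for xx in (3, 4):
        frame[yy][xx] = BALL

position = ball_centroid(frame)
```

Running this per frame gives a track of the ball. It falls apart as soon as something else white enters the shot (pitch markings, kit), which is the snowy-backdrop problem again in milder form.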

I'm pretty sure what Meron meant was getting the orientation of walls so that they can project stuff on top of them.