When your eyes look at an object in 3D space, they not only swivel so both are looking at it at the same time (convergence), but your eye muscles also manipulate the lenses so the object is clear and not blurry (focus). Think of how a camera focusses its lens to bring an object into clear, sharp view.
So in the case of the Oculus, the lenses in your eyes remain focussed on the images on the screens right in front of you, rather than changing depending on how far away the object is in the virtual scene. So there's a disparity there that causes your brain to go wtf and stops it from fully accepting what it's seeing as real.
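To put rough numbers on that disparity, here's a minimal sketch. It assumes a 63 mm distance between the eyes and a headset whose optics put the screen image at a fixed 2 m focus distance; both values are illustrative guesses, not Oculus specs, and the function names are made up for this example. Convergence is measured as the angle between the two eyes' lines of sight, and focus as lens demand in diopters (1 / distance in metres).

```python
import math

IPD_M = 0.063          # assumed distance between the eyes, in metres
SCREEN_FOCUS_M = 2.0   # assumed fixed focus distance of the headset optics

def vergence_angle_deg(distance_m, ipd_m=IPD_M):
    """Angle between the two lines of sight when both eyes aim at a point straight ahead."""
    return math.degrees(2 * math.atan((ipd_m / 2) / distance_m))

def focus_demand_diopters(distance_m):
    """Focusing demand on the eye's lens for an object at this distance."""
    return 1.0 / distance_m

def mismatch_diopters(virtual_distance_m, screen_focus_m=SCREEN_FOCUS_M):
    """Gap between where the eyes converge (virtual object) and where they focus (screen)."""
    return abs(focus_demand_diopters(virtual_distance_m)
               - focus_demand_diopters(screen_focus_m))

if __name__ == "__main__":
    for d in (0.3, 0.5, 2.0, 10.0):
        print(f"virtual object at {d:>4} m: "
              f"eyes converge by {vergence_angle_deg(d):.2f} deg, "
              f"focus mismatch {mismatch_diopters(d):.2f} D")
```

Under those assumptions the mismatch is zero only when the virtual object sits at the screen's focus distance, and it grows quickly for objects rendered close to your face, since convergence keeps changing while focus stays put.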
One way to think of it: When you take a picture with a camera, different objects in the photo may be in or out of focus, depending on how far away from the camera they are.
Closer objects may be sharp and distant objects blurry, or vice versa. Or somewhere in between.
Of course you can aim the camera one direction or another to choose its field of view - which objects appear in the frame and where - but that's completely separate from which of those objects are in focus.
Focusing is one thing, aiming the camera is another.
And that doesn't change at all when you have two cameras. You can aim them both at the same thing, you can aim them off into the distance, but you still have to focus them both.
When I look at something far away, close things are out of focus. When I look at close things, distant things are out of focus. When I watch a 3D movie, I just align the glasses and things are focused according to the camera that shot it.