For a course project, I implemented DTAM, which does dense mapping and tracking from a single camera, with no stereo vision needed, and runs in real time on a mid-range GPU. It's really amazing.
Unfortunately, I don't know CUDA yet, so my implementation is not GPU-accelerated. I can't comment on its feasibility on mobile GPUs, but the paper's authors reported real-time performance on a system with a GTX 480 and a quad-core i7 CPU.
> DTAM is a system for real-time camera tracking and reconstruction which relies not on feature extraction but dense, every pixel methods. As a single hand-held RGB camera flies over a static scene, we estimate detailed textured depth maps at selected keyframes to produce a surface patchwork with millions of vertices. We use the hundreds of images available in a video stream to improve the quality of a simple photometric data term, and minimise a global spatially regularised energy functional in a novel non-convex optimisation framework. Interleaved, we track the camera's 6DOF motion precisely by frame-rate whole image alignment against the entire dense model. Our algorithms are highly parallelisable throughout and DTAM achieves real-time performance using current commodity GPU hardware. We demonstrate that a dense model permits superior tracking performance under rapid motion compared to a state of the art method using features; and also show the additional usefulness of the dense model for real-time scene interaction in a physics-enhanced augmented reality application.
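
To give a feel for the "simple photometric data term" the abstract mentions, here is a minimal CPU sketch in plain Python. It is not the paper's code: the function name `photometric_cost_volume`, the `(fx, fy, cx, cy)` intrinsics tuple, the convention that `R` and `t` map reference-camera coordinates into each other frame, and the nearest-neighbour sampling are all assumptions made for illustration. For every pixel of a reference keyframe and every inverse-depth hypothesis, it reprojects the pixel into the other frames and averages the absolute intensity difference. The spatial regularisation and the optimisation described in the abstract are left out entirely.

```python
def photometric_cost_volume(ref, frames, intrinsics, inv_depths):
    """Average L1 photometric error per pixel and inverse-depth hypothesis.

    ref        -- reference keyframe, 2D list of grayscale intensities
    frames     -- list of (image, R, t); assumed convention: X_m = R * X_ref + t
    intrinsics -- (fx, fy, cx, cy) of an assumed shared pinhole camera
    inv_depths -- inverse-depth hypotheses to test (all > 0)
    Returns volume[v][u][k]; inf where no frame observes the hypothesis.
    """
    fx, fy, cx, cy = intrinsics
    h, w = len(ref), len(ref[0])
    volume = [[[float("inf")] * len(inv_depths) for _ in range(w)]
              for _ in range(h)]
    for v in range(h):
        for u in range(w):
            # Back-project the pixel to a ray at unit depth.
            ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
            for k, inv_d in enumerate(inv_depths):
                depth = 1.0 / inv_d
                point = [c * depth for c in ray]
                total, count = 0.0, 0
                for image, R, t in frames:
                    # Move the 3D point into the other camera's frame.
                    x, y, z = (sum(R[i][j] * point[j] for j in range(3)) + t[i]
                               for i in range(3))
                    if z <= 0:
                        continue  # behind the camera
                    # Project and sample with nearest-neighbour lookup.
                    pu = round(fx * x / z + cx)
                    pv = round(fy * y / z + cy)
                    if 0 <= pu < w and 0 <= pv < h:
                        total += abs(ref[v][u] - image[pv][pu])
                        count += 1
                if count:
                    volume[v][u][k] = total / count
    return volume
```

Taking the lowest-cost hypothesis per pixel gives a noisy winner-take-all depth map. Averaging over many frames reduces that noise, and every pixel is independent of the others, which is why this step maps so naturally onto a GPU.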