This absolutely works, but not quite the way GP described. Instead, the sensor is moved in whole-pixel steps across the Bayer pattern, so that every logical pixel location is sampled through every colour filter (giving you a full RGB pixel at each site). Hence no demosaicing is required, and higher spatial frequencies can be retained without incurring aliasing.
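To make that concrete, here is a rough sketch of the idea, not any vendor's actual pipeline. It assumes an RGGB layout, four exposures at whole-pixel offsets, a perfectly static scene, and frames that are already registered in scene coordinates; the names `BAYER`, `SHIFTS`, `capture`, `combine` and `pixel_shift` are made up for illustration.

```python
# Assumed RGGB layout: colour index (0=R, 1=G, 2=B) by photosite row/column parity.
BAYER = ((0, 1), (1, 2))
# Assumed whole-pixel sensor offsets (dy, dx), one per exposure.
SHIFTS = ((0, 0), (0, 1), (1, 0), (1, 1))


def capture(scene, dy, dx):
    """One exposure: each scene point is seen through exactly one colour filter.

    `scene` is a list of rows of (R, G, B) tuples. The result is indexed in
    scene coordinates and stores (channel, value) per point.
    """
    height, width = len(scene), len(scene[0])
    frame = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Filter of the photosite that sits over (y, x) after the shift.
            ch = BAYER[(y - dy) % 2][(x - dx) % 2]
            frame[y][x] = (ch, scene[y][x][ch])
    return frame


def combine(frames):
    """Merge the exposures: no neighbour interpolation, only per-point averaging."""
    height, width = len(frames[0]), len(frames[0][0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            sums = [0.0, 0.0, 0.0]
            counts = [0, 0, 0]
            for frame in frames:
                ch, value = frame[y][x]
                sums[ch] += value
                counts[ch] += 1
            # With the four shifts above: R once, G twice, B once per point.
            row.append(tuple(s / n for s, n in zip(sums, counts)))
        out.append(row)
    return out


def pixel_shift(scene):
    """Simulate the four shifted exposures and merge them."""
    return combine([capture(scene, dy, dx) for dy, dx in SHIFTS])
```

Under those assumptions every point gets a measured R, G and B value, so the merged image matches the scene without guessing any channel from neighbouring sites. A real camera also has to deal with noise, edges and anything that moves between exposures, none of which is modelled here.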
Camera lenses bend light, so offsetting on a micro scale produces a slightly different perspective. You would end up with a blurrier image, not a higher-resolution one.