Last I checked, they pretty much don't recover from page faults, they just abort whatever "program" you're trying to run, so you can't really use the MMU for clever things like demand paging. But that's not the point of the GPU's MMU in the first place.
> Last I checked, they pretty much don't recover from page faults, they just abort whatever "program" you're trying to run
Correct. When the GPU page faults, it raises a CPU interrupt, which the driver handles. It's not possible to resume execution on the GPU in a timely manner, so the only option is to terminate the process that caused the page fault.
> so you can't really use the MMU for clever things like demand paging
Recent GPU generations support "sparse" or "tiled" memory, where the GPU can detect whether a load or a store would access non-resident memory and then act accordingly. This requires a specialized shader and some CPU-side logic to actually stream in the memory. It can be used to implement on-demand paging for textures and buffers, as well as workarounds that reduce visual artifacts from streaming.
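To make the sparse-memory loop concrete, here's a rough CPU-only simulation in Python. It isn't any real graphics API: `SparseTexture`, `sample`, `stream_in` and `PAGE_SIZE` are names I made up for illustration. The idea it sketches is that the "shader" checks residency instead of faulting, returns a fallback value on a miss, and records the miss so the CPU side can stream the page in later.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4  # texels per page; tiny hypothetical value for illustration


@dataclass
class SparseTexture:
    """Toy model of a sparse resource (hypothetical, not a real GPU API)."""
    backing: list                                  # full data in CPU/disk storage
    resident: dict = field(default_factory=dict)   # page index -> texels in "GPU" memory
    feedback: set = field(default_factory=set)     # pages the "shader" missed

    def sample(self, texel, fallback=0):
        """Shader-side read: test residency instead of page faulting."""
        page, offset = divmod(texel, PAGE_SIZE)
        if page in self.resident:
            return self.resident[page][offset]
        # Non-resident: record the miss for the CPU and carry on with a fallback.
        self.feedback.add(page)
        return fallback

    def stream_in(self):
        """CPU-side step: load every page the shader reported as missing."""
        loaded = sorted(self.feedback)
        for page in loaded:
            start = page * PAGE_SIZE
            self.resident[page] = self.backing[start:start + PAGE_SIZE]
        self.feedback.clear()
        return loaded


tex = SparseTexture(backing=list(range(100, 116)))
first = tex.sample(5)    # page 1 is not resident: fallback returned, miss recorded
tex.stream_in()          # CPU-side logic makes page 1 resident
second = tex.sample(5)   # same read now hits resident memory
print(first, second)
```

In a real renderer the fallback would typically be something like a coarser mip level that is already resident rather than a constant, which is where the "reduce visual artifacts" part comes in. The frame keeps rendering with slightly worse data while the page streams in, instead of the whole context being killed.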