> To maybe ask a better way: will this practically help me train my DNN faster?
Probably not. It will only help you if you previously needed to train it on a CPU because you were in a VM, but this seems unlikely. It will not speed up your existing GPU in any way compared to simply using it bare-metal right now.
> Or if I'm a cloud vendor, will this allow me to deploy cheaper GPU for my users?
Yes. This ports a feature from the XXXX$-range of GPUs to the XXX$-range of GPUs. Since the two ranges perform similarly, you can save a lot of money this way. It also lowers the entry cost to the market (i.e. a hypervisor could now be sub-$1k, if you go for cheap parts).
On the other hand, a business selling GPU time to customers will probably not want to rely on a hack (especially since there's a good chance it violates NVIDIA's license), so unless you're building your own HW, your bill will probably not drop. But if you're an ML startup or a hobbyist, you can now cheap out on, or actually afford, this kind of setup.
The host still wants access to the GPU to do stuff like compositing windows and H.265 encode/decode.