Beautiful effort. I'm obsessed with sharing GPUs across workloads, although my interest has dwelt more on virtual desktop/VDI/cloud gaming. The people making the chips just want to charge so much money for sharing a GPU, and seem to make it as hard as they can.
Take a look at the Medium article [1] and it will be clear to you that this is not the same.
Each complete GPU sharing approach must have:
- A mechanism to facilitate sharing (security, isolation, avoiding OOM errors).
- A K8s integration.
Most approaches (like the one you mentioned above) lack such a mechanism and simply work around the 1-1 GPU allocation on Kubernetes by advertising more devices per physical GPU.
Those are not viable solutions.
Please take a look at Paragraph 5 ("The real challenge of GPU virtualization on K8s") onwards as well as the repo notes.
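To make the point about advertising more devices per physical GPU concrete, here is a toy simulation in Python. It is only a sketch under stated assumptions: `PhysicalGPU`, `advertise`, the 16 GiB capacity, and the 6000 MiB per-pod request are all hypothetical names and numbers picked for illustration, and no real Kubernetes or device plugin API is involved. It shows that the scheduler can see four "devices" while nothing stops the pods behind them from exhausting the one GPU they share.

```python
from dataclasses import dataclass


@dataclass
class PhysicalGPU:
    """Hypothetical stand-in for one physical GPU with a fixed memory pool."""
    uuid: str
    memory_mib: int
    allocated_mib: int = 0

    def alloc(self, mib: int) -> None:
        # The hardware has a single memory pool; there is no per-pod limit.
        if self.allocated_mib + mib > self.memory_mib:
            raise MemoryError(f"{self.uuid}: out of memory")
        self.allocated_mib += mib


def advertise(gpus, replicas):
    """Map each advertised device ID to its backing GPU (hypothetical helper).

    This mimics the workaround: every physical GPU is listed `replicas`
    times, so the scheduler believes there are more devices than exist.
    """
    return {f"{g.uuid}::{i}": g for g in gpus for i in range(replicas)}


gpu = PhysicalGPU(uuid="GPU-0", memory_mib=16384)
devices = advertise([gpu], replicas=4)

# One pod per advertised device; each pod tries to use 6000 MiB.
outcomes = []
for device_id, backing_gpu in devices.items():
    try:
        backing_gpu.alloc(6000)
        outcomes.append("ok")
    except MemoryError:
        outcomes.append("oom")

print(len(devices), outcomes)
```

In this toy model the first two pods fit and the remaining two hit OOM, even though all four were "successfully" scheduled. That is the missing mechanism: the advertised count changes what the scheduler sees, not what the GPU can isolate or enforce.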
Nice Medium article on the challenges here too: https://grgalex.medium.com/gpu-virtualization-in-k8s-challen...