|
|
|
|
|
by alexeldeib
234 days ago
|
|
as someone in the space this ticks a lot of boxes: kubernetes-native, strong isolation, python sdk (ideal for ML scenarios). devmapper is a nice ootb approach. Glancing at the readme, is your business model technical support? Or what's your plan with this? Anything interesting to share around startup time for large artifacts, scaling, passing through persistent storage (or GPUs) to these sandboxes? Curious what things like 'Multi-node cluster capabilities for distributed workloads' mean exactly? inter-VM networking? |
|
By multi-node I mean so far I only support 1 k8s node, i.e. 1 machine, but soon adding support for multiple. Still, on 20 CPUs I can run +50 VM pods with fractional vCPU limits.
For GPU passthrough: not possible today because I use Firecracker as VMM. On roadmap: Add support for Qemu, then GPU passthrough possible.
Inter-VM networking: it's already possible on single-node: 1 VM = 1 pod. Can have multiple pods per node (have a look at utils/stress-test.sh). Right now I default deny-all ingress for safety (because by default k8s allows inter pod communication), but can make ingress configurable.
Startup time: a second, or a few seconds, depending on which base image (alpine, ubuntu, etc...) and whether you use a before_script or not (what I execute before the network lockdown)
Large artifacts: you can configure resource allocated to a VM pod in the sandbox config and it basically uses k8s resource limits.
Let me know if any other question! Happy to help