by covi
374 days ago
To massively improve your odds of actually getting GPUs, you can use something like SkyPilot (https://github.com/skypilot-org/skypilot) to fall back across regions, clouds, or GPU choices. E.g., $ sky launch --gpus H100 will fall back across GCP regions, AWS, your clusters, etc. There are also options to say "try either H100 or H200 or A100 or <insert>". Essentially, the way you deal with GPU scarcity is to widen the infra search space.
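
To make the "widen the search space" idea concrete, here is a rough Python sketch of the fallback loop. This is not SkyPilot's API or implementation; first_available, fake_provision, and the capacity data are all hypothetical names made up for illustration. It just tries every (GPU, cloud, region) combination in preference order and returns the first one that can be provisioned.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple

Location = Tuple[str, str]        # (cloud, region)
Candidate = Tuple[str, str, str]  # (gpu, cloud, region)


def first_available(
    gpus: Sequence[str],
    locations: Sequence[Location],
    try_provision: Callable[[str, str, str], bool],
) -> Optional[Candidate]:
    """Return the first (gpu, cloud, region) that provisions, else None.

    GPUs are tried in preference order; for each GPU, every location is
    tried before falling back to the next GPU type.
    """
    for gpu, (cloud, region) in product(gpus, locations):
        if try_provision(gpu, cloud, region):
            return (gpu, cloud, region)
    return None


# Hypothetical capacity snapshot: only A100s in one AWS region are free.
FREE_CAPACITY = {("A100", "aws", "us-east-1")}


def fake_provision(gpu: str, cloud: str, region: str) -> bool:
    """Stand-in for a real provisioning call (an assumption, not a real API)."""
    return (gpu, cloud, region) in FREE_CAPACITY


LOCATIONS = [("gcp", "us-central1"), ("gcp", "europe-west4"), ("aws", "us-east-1")]

# Narrow search space: H100 only, so nothing is found.
narrow = first_available(["H100"], LOCATIONS, fake_provision)

# Wider search space: H100, then H200, then A100, so the A100 slot is found.
wide = first_available(["H100", "H200", "A100"], LOCATIONS, fake_provision)
```

With the same locations, asking only for H100 finds nothing, while allowing H100, H200, or A100 lands on the one free slot. Every extra GPU type, region, or cloud multiplies the number of places a request can succeed.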