Thanks. Looks good, but do GPUs like "1x H100 80GB NVLINK" support InfiniBand? Are these HGX modules? (Standard HGX comes in groups of 4 or 8.) Also, when will H200 be available? Great stuff!
Thanks for the feedback! NVLink configurations are available only in 8x; it’s not possible to deploy 1x. These are PCIe modules connected via NVLink bridges, and they are not HGX.
We do plan to expand our fleet with B200s, but H200s are not currently on the roadmap for on-demand.
We’ve just activated the rest of your credits—thank you again for your feedback!
1. I tried an L40 for inference in MON1 with BentoML:
[
import bentoml

@bentoml.service(resources={"gpu": 1})
class MyService:
    def __init__(self):
        import torch
        # Load the serialized model and move it to the first GPU
        self.model = torch.load('model.pth').to('cuda:0')
]
and
2. an RTX A4000 for training in OSL1, with SSH keys for root, using a PyTorch training script:
[
import torch

# Fall back to the CPU when no CUDA device is visible
device = "cuda" if torch.cuda.is_available() else "cpu"
device
]
The image offered as the latest Ubuntu is mainline Ubuntu 22.04 LTS, and it ships with an old CUDA 12.2.
It is mostly unstable after driver upgrades: the device drops out, though it works intermittently!
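For anyone trying to reproduce the drop-outs, here is a rough sketch of a check that can be run before and after a driver upgrade. It only asks `nvidia-smi` which GPUs and driver version it can see, and it assumes `nvidia-smi` is on the PATH when the driver is loaded. The function names `parse_gpu_lines` and `visible_gpus` are made up for illustration and are not part of any provider tooling.

```python
import shutil
import subprocess

# Real nvidia-smi query flags: one CSV line per GPU, "name, driver_version".
QUERY = ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"]


def parse_gpu_lines(text):
    """Turn 'name, driver_version' CSV lines into (name, driver_version) tuples."""
    gpus = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        # Skip blank lines and error messages that are not two-field CSV rows.
        if len(parts) == 2 and all(parts):
            gpus.append((parts[0], parts[1]))
    return gpus


def visible_gpus():
    """Return the GPUs nvidia-smi reports, or [] if the tool or device is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        # A non-zero exit is treated here as "device not visible".
        return []
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus:
        for name, driver in gpus:
            print(f"{name}: driver {driver}")
    else:
        print("no GPU visible to nvidia-smi")
```

An empty result after an upgrade, where the same check listed the card before it, would at least confirm that the device is gone at the driver level and not only inside PyTorch.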
Start Scripts for pyenv and CUDA:
#!/bin/bash
DEBIAN_FRONTEND=noninteractive apt-get -y install wget zip unzip git jq