| HN Mirror

1. I tried an L40 for inference in MON1 with BentoML:

  @bentoml.service(resources={"gpu": 1})
  class MyService:
      def __init__(self):
          import torch
          self.model = torch.load('model.pth').to('cuda:0')

and

2. An RTX A4000 for training in OSL1, using a PyTorch training script [device = "cuda" if torch.cuda.is_available() else "cpu"], with SSH keys for root.

The older CUDA 12.2 ships with the mainline Ubuntu 22.04 LTS image, which is offered as the latest Ubuntu.

It is mostly unstable across driver upgrades: the device drops out, though it works intermittently.
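A quick way to tell whether the device has dropped out is to ask the driver which GPUs it still lists. The following is a minimal sketch, assuming `nvidia-smi` is on PATH; the function name `visible_gpus` is made up for illustration, and a missing binary, a timeout, or a non-zero exit is simply treated as "no GPU visible".

```python
import shutil
import subprocess


def visible_gpus(timeout_s: float = 10.0) -> list[str]:
    """Return the GPU lines reported by `nvidia-smi -L`, or [] if none are visible."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # driver tools not installed or not on PATH
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=timeout_s
        )
    except (subprocess.TimeoutExpired, OSError):
        return []  # a hung or unlaunchable nvidia-smi is treated as a dropout
    if result.returncode != 0:
        return []
    # Each detected device is listed on a line that starts with "GPU <index>:".
    return [line for line in result.stdout.splitlines() if line.startswith("GPU ")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print(f"{len(gpus)} GPU(s) visible")
    for line in gpus:
        print(line)
```

Running this from cron or a loop before and after a driver upgrade would give a rough timeline of when the device disappears.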

Start script for pyenv and CUDA:

  #!/bin/bash

  DEBIAN_FRONTEND=noninteractive apt-get -y install wget zip unzip git jq
  DEBIAN_FRONTEND=noninteractive apt-get install -y python3-pip

  # Required for pyenv make
  DEBIAN_FRONTEND=noninteractive apt-get install -y build-essential zlib1g-dev libffi-dev libssl-dev libbz2-dev libreadline-dev libsqlite3-dev liblzma-dev libncurses-dev tk-dev
  DEBIAN_FRONTEND=noninteractive apt install -y --reinstall gcc-12
  ln -s -f /usr/bin/gcc-12 /usr/bin/gcc

  # Remove existing nvidia drivers and install cuda 12.6.1
  # (patterns are quoted so the shell does not expand them against local files)
  DEBIAN_FRONTEND=noninteractive apt-get -y remove --purge '*nvidia*'
  DEBIAN_FRONTEND=noninteractive apt-get -y remove --purge '*cuda*'
  DEBIAN_FRONTEND=noninteractive apt-get -y remove --purge '*nvrtc*'
  DEBIAN_FRONTEND=noninteractive apt-get -y autoremove --purge
  wget https://developer.download.nvidia.com/compute/cuda/12.6.1/lo... && sh cuda_12.6.1_560.35.03_linux.run --silent --override

  DEBIAN_FRONTEND=noninteractive apt-get -y update
  DEBIAN_FRONTEND=noninteractive apt-get -y upgrade
  DEBIAN_FRONTEND=noninteractive reboot
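After the reboot it is worth confirming that the driver actually loaded matches the runfile. The sketch below is one possible check, not part of the original start script: it assumes `nvidia-smi` is on PATH, takes the expected version `560.35.03` from the runfile name above, and uses hypothetical helper names (`parse_gpu_rows`, `query_gpus`).

```python
import shutil
import subprocess

# Assumption: the expected driver version is the one embedded in the runfile name.
EXPECTED_DRIVER = "560.35.03"


def parse_gpu_rows(csv_text: str) -> list[tuple[str, str]]:
    """Split `driver_version, name` CSV rows into (driver, name) tuples."""
    rows = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        driver, _, name = line.partition(",")
        rows.append((driver.strip(), name.strip()))
    return rows


def query_gpus(timeout_s: float = 10.0) -> list[tuple[str, str]]:
    """Ask nvidia-smi for driver version and name of each GPU; [] on any failure."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    cmd = [exe, "--query-gpu=driver_version,name", "--format=csv,noheader"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except (subprocess.TimeoutExpired, OSError):
        return []
    return parse_gpu_rows(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    rows = query_gpus()
    if not rows:
        print("No GPU reported by nvidia-smi")
    for driver, name in rows:
        status = "ok" if driver == EXPECTED_DRIVER else f"expected {EXPECTED_DRIVER}"
        print(f"{name}: driver {driver} ({status})")
```

A mismatch here would suggest that the later `apt-get upgrade` pulled in a different driver package, or that the runfile install did not complete.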