|
|
|
|
|
by areddyyt
646 days ago
|
|
Thank you! CUDA and Nvidia are practically impenetrable on the server side. To be very concrete, we did training for our models on AWS with parallel cluster. We used P5 instances (8xH100) that were scheduled with SLURM. A problem we ran into however, was that our training jobs were containerized. Thankfully, pyxis and enroot exist to run containerized jobs on SLURM. And who else, but Nvidia, develop and maintain those plugins. For practically any weird niche use case, Nvidia seems to have some software solution - but only on x86. Jetson is a whole other beast. There is no guarantee any pip package you install has an aarch64/arm64 wheel. For example, we could not use torch_tensorrt, to compile to TensorRT via Torch Inductor. Why? Because the Bazel build system was only configured to build for Jetpack 4.6 or Jetpack 5.1, and we were using Jetpack 6. While Nvidia provides docker images for x86 systems that come with torch_tensorrt installed, their L4T (Linux for Tegra) images do not. Instead we had to manually write out a new workspace file and compile for Jetpack6 to provide TensorRT compiling support. tl;dr: Nvidia and CUDA have a great walled garden on x86, not so much on their edge computing devices |
|
Do you think otherwise?