TL;DR: uses the latest ROCm 6.2 to run full-precision inference for Llama 405B on a single node with 8x AMD MI300X GPUs.
How mature do you think the ROCm 6.2 / AMD stack is compared to NVIDIA's?
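
For context on the single-node claim, here is a rough back-of-the-envelope check of whether the weights fit. It assumes "full precision" means BF16 (2 bytes per parameter), which the post does not actually state, and it counts weights only, ignoring KV cache and activations. The 192 GB figure is the MI300X's per-GPU HBM3 capacity.

```python
# Rough memory check: do Llama 405B weights fit on one 8x MI300X node?
# Assumption (not stated in the post): "full precision" means BF16.
PARAMS = 405e9          # approximate parameter count
GPUS = 8
HBM_PER_GPU_GB = 192    # MI300X HBM3 capacity per GPU


def weights_gb(params: float, bytes_per_param: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


node_hbm_gb = GPUS * HBM_PER_GPU_GB   # total HBM on the node
bf16_gb = weights_gb(PARAMS, 2)       # 2 bytes per parameter
fp32_gb = weights_gb(PARAMS, 4)       # 4 bytes per parameter

print(f"node HBM: {node_hbm_gb} GB")
print(f"BF16 weights: {bf16_gb:.0f} GB, fits: {bf16_gb < node_hbm_gb}")
print(f"FP32 weights: {fp32_gb:.0f} GB, fits: {fp32_gb < node_hbm_gb}")
```

Under these assumptions the BF16 weights take roughly half of the node's HBM, leaving the rest for KV cache and activations, while FP32 weights alone would not fit on a single node.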