by conshama 624 days ago
related: https://www.nonbios.ai/post/deploying-large-405b-models-in-f...

TL;DR: uses the latest ROCm 6.2 to run full-precision inference for Llama 405B on a single node with 8x AMD MI300X GPUs.
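
For anyone wondering why a single node is enough, here is a rough back-of-the-envelope check. It assumes "full precision" means 16-bit weights (BF16/FP16, 2 bytes per parameter), which the post does not state explicitly, and uses the MI300X's 192 GB of HBM3 per GPU. The variable names are mine, purely for illustration.

```python
# Back-of-the-envelope check: do the model weights fit on one 8-GPU node?
PARAMS = 405e9          # Llama 405B parameter count (approximate)
BYTES_PER_PARAM = 2     # assumption: "full precision" = 16-bit (BF16/FP16)
GPUS = 8                # single node, 8 GPUs
HBM_PER_GPU_GB = 192    # MI300X HBM3 capacity per GPU

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9   # weights only, no KV cache
node_hbm_gb = GPUS * HBM_PER_GPU_GB           # total HBM across the node
headroom_gb = node_hbm_gb - weights_gb        # left for KV cache, activations

print(f"weights:  {weights_gb:.0f} GB")
print(f"node HBM: {node_hbm_gb} GB")
print(f"headroom: {headroom_gb:.0f} GB")
```

Under those assumptions the weights take about 810 GB of the node's 1536 GB, so there is no need to quantize or to split the model across nodes. Real usage will be higher once KV cache and runtime overhead are counted.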

How mature do you think AMD's ROCm 6.2 stack is compared to Nvidia's?

1 comment

Does this use vLLM?
Yes.