by Archit3ch 206 days ago
> The fp64 and fp32 performance is needed for physical simulations

In the very unlikely case where

1) You need fp64 Matrix-Matrix products for physical simulations

2) You bought the MI355X accelerator instead of hardware better suited for the task

you can still emulate it with the Ozaki scheme.
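For anyone who hasn't seen it, here is a rough sketch of the idea, not a production implementation. It uses an integer-slice variant in the spirit of the Ozaki scheme: each fp64 value is cut into small signed integers that would fit in int8, the slice products are computed exactly, and only the final accumulation happens in fp64. The function names `split_rows` and `ozaki_matmul`, the 7-bit slice width, and the slice count of 8 are my own assumptions for illustration; pure Python integers stand in for the low-precision matrix units.

```python
import math

def split_rows(M, num_slices=8, bits=7):
    """Cut each row of M into integer slices sharing one exponent per row.

    Afterwards x ~= 2**e * sum_s q_s * 2**(-bits * (s + 1)), with |q_s| < 2**bits.
    Illustrative helper only, not part of any library.
    """
    slices = [[[0] * len(row) for row in M] for _ in range(num_slices)]
    exps = []
    for i, row in enumerate(M):
        amax = max(abs(x) for x in row)
        e = math.frexp(amax)[1] if amax else 0   # amax < 2**e
        exps.append(e)
        for j, x in enumerate(row):
            r = math.ldexp(x, -e)                # |r| < 1, power-of-two scaling
            for s in range(num_slices):
                r = math.ldexp(r, bits)          # move the next `bits` bits up
                q = math.trunc(r)                # small signed integer slice
                slices[s][i][j] = q
                r -= q                           # remainder, computed exactly
    return slices, exps

def ozaki_matmul(A, B, num_slices=8, bits=7):
    """Approximate the fp64 product A @ B from exact small-integer products."""
    Bt = [list(col) for col in zip(*B)]          # slice B column by column
    A_sl, eA = split_rows(A, num_slices, bits)
    B_sl, eB = split_rows(Bt, num_slices, bits)
    C = [[0.0] * len(Bt) for _ in A]
    for s in range(num_slices):
        for t in range(num_slices):
            shift = -bits * (s + t + 2)
            for i in range(len(A)):
                for j in range(len(Bt)):
                    # Exact integer dot product: the part an int8 matrix
                    # unit with a wide accumulator would do in hardware.
                    dot = sum(a * b for a, b in zip(A_sl[s][i], B_sl[t][j]))
                    # Rescale and accumulate in fp64.
                    C[i][j] += math.ldexp(float(dot), eA[i] + eB[j] + shift)
    return C
```

Calling `ozaki_matmul(A, B)` on small lists of lists should agree with a plain fp64 product to roughly fp64 accuracy for well-conditioned inputs. The catch is cost: with 8 slices per operand this sketch does 64 low-precision products per fp64 product, so whether it pays off depends on how much faster the low-precision units are.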

1 comments

What hardware is better suited for the task? In terms of FLOPS per dollar, NVIDIA is retreating from fp64 just as much as AMD is.
ARMv9 Scalable Matrix Extension (SME). Apple has had outer-product matrix hardware (AMX) since 2019, but you cannot buy the chips by themselves.
Yeah, I saw the presentations at SC25, but I wasn't able to get anyone to commit to selling them in the next year or three. Right now I have two open RFPs and nobody is bidding ARM.