Hacker News
by frognumber 907 days ago
It's even worse.

Order of operations:

1. It just silently stopped working, started crashing, and I had no idea what was going on or why. Just sort of intermittent f-age where some older versions would work better, and newer ones would work worse. I never got it reliably working. In most cases, at some point, the system would become unresponsive, and then crash hard.

2. AMD removed it from the supported list (with no notice).

3. The github ticket above was filed, which explained what was going on.

Lots of time wasted debugging. This interacted with a half-dozen other AMD bugs and issues (for example, the GPU only worked for headless compute, so I needed to drive my monitor with a different card).

Human time is more expensive than equipment, so the total cost of this stuff was astronomical.

Yes, notice is industry-standard, but at the very least, when support broke / was later removed, the driver could try warning me "We no longer support this card, do you REALLY want to proceed?" rather than letting me know by hard-crashing my system.

1 comment

People don't seem to understand how ROCm fails. Some inaccessible list buried deep down somewhere, or a GitHub issue, says your card has been dropped. Apparently the average user is supposed to spend ages researching this. When you try it out, as any sane person does, you don't get a nice "unsupported GPU" message from ROCm. The failure when you use ROCm is instability of your entire OS, not a clean crash of the program you ran. This invites lots of messing around and desperately trying to get it to work, which is a very frustrating experience. And then all people do is say "look in this obscure GitHub issue, they dropped support for your GPU, you're the one in the wrong".
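
For what it's worth, the kind of up-front "unsupported GPU" check being asked for in this thread could look roughly like the sketch below. Everything in it is hypothetical: `SUPPORTED_ARCHS`, the `arch_*` names, and `preflight` are made up for illustration and are not part of ROCm or any AMD driver, and nothing here says how the real stack decides what is supported.

```python
# Hypothetical allow-list for illustration; the real supported set is not given in this thread.
SUPPORTED_ARCHS = frozenset({"arch_a", "arch_b"})


def preflight(arch, supported=SUPPORTED_ARCHS, force=False):
    """Return (ok, message): refuse unsupported hardware unless the user forces it."""
    if arch in supported:
        # Supported hardware: nothing to warn about.
        return True, ""
    message = f"We no longer support {arch}."
    if force:
        # The user explicitly opted in, so warn but allow the run.
        return True, message + " Proceeding anyway because force=True."
    # Fail cleanly with an explanation instead of starting work on the device.
    return False, message + " Pass force=True if you REALLY want to proceed."
```

The point is only that the check runs before any work is submitted to the device, so the worst case is a refusal with a readable message rather than a hard crash.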