by cjensen 241 days ago
This is asking too much. The management of trim, reallocation, wear leveling, and so much more is very complex. It's a full software stack hiding behind the abstraction of NVMe. Every manufacturer is running a different stack with different features and tradeoffs. The "stats" the author is asking for would be entirely different between manufacturers, and I doubt there is that much to be gained from peering behind the curtain.

It has been done previously for CPUs, which are much more complex than SSDs. Why couldn’t each manufacturer expose whatever performance metrics it has, in whichever way it wants (as the post argues, e.g., through SMART), and then let system engineers exploit this information to optimize for their use cases?
Seems like a poor example, since CPU performance metrics differ not only between ISAs and between vendors of one ISA (AMD vs. Intel, for example), but also between parts from a single vendor. There's a 1000-page PDF that tries to explain what all the Intel PMU counters mean on different CPUs, and it's full of errors and omissions as well.
Yes, but these differences don’t really matter. There are multiple techniques that system engineers can use to perform both variable selection and regularization (to help with differences across multiple architectures) to help them select counters that matter for their specific use cases.

But then saying “it is too much to ask” is just another way to limit what users can do with the specific resources they paid for.
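
To make the "variable selection and regularization" point concrete, here is a minimal sketch of one such technique: an L1-regularized (lasso) fit via coordinate descent, which drives the weights of uninformative counters to exactly zero. Everything in it is an assumption for illustration: the counter names are hypothetical and not from any vendor's SMART or PMU documentation, the data is synthetic, and lasso is just one of several techniques that would fit the description.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def standardize(columns):
    """Rescale each counter to zero mean and unit variance so weights are comparable."""
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        std = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5 or 1.0
        scaled.append([(v - mean) / std for v in col])
    return scaled

def lasso(columns, target, penalty, sweeps=200):
    """Coordinate-descent lasso on standardized columns; returns one weight per column."""
    n = len(target)
    mean_y = sum(target) / n
    residual = [t - mean_y for t in target]
    weights = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Correlation of counter j with the residual that excludes its own contribution.
            rho = sum(c * (r + weights[j] * c) for c, r in zip(col, residual)) / n
            new_weight = soft_threshold(rho, penalty)
            delta = new_weight - weights[j]
            if delta:
                residual = [r - delta * c for r, c in zip(residual, col)]
                weights[j] = new_weight
    return weights

# Hypothetical counter names, synthetic samples: only two counters drive latency.
random.seed(0)
names = ["gc_pages_moved", "host_writes", "read_retries", "temperature", "erase_count"]
raw = [[random.gauss(0.0, 1.0) for _ in range(200)] for _ in names]
latency = [3.0 * gc + 1.5 * rr + random.gauss(0.0, 0.1)
           for gc, rr in zip(raw[0], raw[2])]

weights = lasso(standardize(raw), latency, penalty=0.2)
selected = [name for name, w in zip(names, weights) if w != 0.0]
print(selected)
```

The penalty is the knob: raise it and fewer counters survive, lower it and more do. On real hardware you would refit per drive model or CPU model, which is how the approach copes with counters that mean different things across vendors.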

The abstraction is the problem. Get rid of the translation layer, manage flash directly in the operating system, and suddenly the ambiguity dissolves. You would get meaningful, uniform statistics with semantics necessarily matching those used by your operating system.
Do I really want my relatively expensive general-purpose CPU to be burdened with the task of managing flash using software, when a relatively inexpensive ASIC does that job very quickly and efficiently?

There's a lot of non-trivial stuff that goes on inside of a modern SSD. And to be sure, none of it is magic; all of it could certainly be implemented in software.

But is that kind of drastic move strictly necessary in order to get meaningful statistics?

There would be other benefits, such as reduced write amplification and better workload isolation. The observability would just be gravy.
SSDs aren't using ASICs. They're full-blown computers. Apple has moved SSD control onto the SoC, and it seems to work for them.
Apple moving SSD control to a hardware block in their own custom chip is not the same thing as implementing the functionality using software.

(You've heard about apple and orange comparisons, right? Right.)

I would call that thing running on that custom chip software.
You don't need me or anyone else to tell you that you're free to call it whatever you want.

I'm going to keep referring to the QuickSync video encoding block in my CPU as "hardware," though, because the tiny lump of transistors that is dedicated to performing this specialized task is something that I can kick.

Relatedly, the business of managing raw NAND storage on Apple devices and abstracting it to operating system software as NVMe: That translation happens in hardware. That hardware is also something that I can kick, so I'm going to keep calling it "hardware".