Hacker News
by dr_zoidberg 2351 days ago
Ok, I've studied Flash storage technology (what most SSDs use these days) and it can be understood like this:

* At the "lowest" level, there's a little cell that is very much like an EEPROM cell (but better, because newer tech). This little cell can hold 1, 2, 3 or 4 bits, depending on gen/tech.

* You group a bunch of those cells together and they form a page. Usually it's 1024 cells a page.

* You group a bunch of pages together and they form a block (don't confuse with "block" as in "block oriented device"). Blocks are usually made of 128 pages.

* You group a bunch (1024 usually) of blocks together and you get a plane.

* You get your massive storage by grouping a lot of planes together. Think of it as small (16-64 MB) storage devices that you connect in a RAID-like manner.

* Operations are restricted because of the technology. On an individual level, cells can only be "programmed", that is, a 1 bit can be flipped to a 0, but a 0 cannot be made a 1.

* If you need to turn a 0 into a 1, then you must do it on a block level (yep, 128 pages at a time).

* That's where the Flash Translation Layer kicks in: it's a mapping between the (logical) sectors (512 or 4096 bytes) and the underlying mess. The FTL tells you how the sectors are formed (they would be the blocks of a "block oriented device", but I'm trying to avoid that word).

* You also have "overprovisioning" at work - that is, if your SSD is 120GB, it's actually 128GB inside, but there are 8GB you don't get access to (not even at the OS level), which the device uses to move things around.

* Wear Leveling/Garbage Collection mechanisms work to prevent individual cells from being used too much. Garbage Collection makes sure (or tries) that there are always enough "ready to program" cells around.

* The firmware makes everything work transparently to the world above it.
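
The program/erase restriction and the FTL mapping from the bullets above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real controller firmware works: the names `Block` and `TinyFTL` are made up for illustration, the geometry (2 blocks of 4 pages of 4 bytes) is deliberately tiny, and garbage collection is left out.

```python
class Block:
    """One erase block: pages are programmed individually, erased all at once."""

    def __init__(self, pages_per_block=4, page_size=4):
        self.page_size = page_size
        # Erased flash reads back as all 1 bits (0xFF bytes).
        self.pages = [bytes([0xFF] * page_size) for _ in range(pages_per_block)]

    def program(self, page_no, data):
        if len(data) != self.page_size:
            raise ValueError("data must be exactly one page long")
        # Programming can only clear bits (1 -> 0), so the result is old AND new.
        old = self.pages[page_no]
        self.pages[page_no] = bytes(o & d for o, d in zip(old, data))

    def erase(self):
        # The only way to get 1s back: reset every page in the block.
        self.pages = [bytes([0xFF] * self.page_size) for _ in self.pages]


class TinyFTL:
    """Hypothetical FTL: maps logical sectors to (block, page) locations."""

    def __init__(self, num_blocks=2, pages_per_block=4, page_size=4):
        self.blocks = [Block(pages_per_block, page_size) for _ in range(num_blocks)]
        self.free = [(b, p) for b in range(num_blocks) for p in range(pages_per_block)]
        self.mapping = {}   # logical sector -> (block, page)
        self.stale = set()  # physical pages that hold outdated data

    def write(self, sector, data):
        if not self.free:
            raise RuntimeError("no free pages; a real FTL would garbage-collect here")
        block_no, page_no = self.free.pop(0)
        self.blocks[block_no].program(page_no, data)
        if sector in self.mapping:
            # The old copy cannot be overwritten in place; just mark it stale.
            self.stale.add(self.mapping[sector])
        self.mapping[sector] = (block_no, page_no)

    def read(self, sector):
        block_no, page_no = self.mapping[sector]
        return self.blocks[block_no].pages[page_no]


ftl = TinyFTL()
ftl.write(7, b"abcd")
ftl.write(7, b"wxyz")   # rewrite lands on a fresh page, old page becomes stale
print(ftl.read(7), ftl.mapping[7], ftl.stale)
```

Rewriting sector 7 never touches the page that held the first copy; the mapping simply moves, which is why the device needs spare space and a cleanup mechanism.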

That would be a very (very very) simple explanation of how Flash storage works. Things like memory cards and thumb drives usually get neither overprovisioning nor wear leveling.
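
To make the wear leveling and garbage collection bullets a bit more concrete, here is a minimal sketch of two common textbook policies: greedy victim selection (reclaim the block with the fewest valid pages) and least-erased-first allocation. Both function names and all the sample numbers are assumptions for illustration; real firmware policies are proprietary and considerably more involved.

```python
def pick_gc_victim(valid_pages_per_block):
    """Greedy GC: reclaim the block with the fewest still-valid pages,
    since it costs the least copying before the block can be erased."""
    return min(valid_pages_per_block, key=valid_pages_per_block.get)


def pick_block_to_write(erase_counts, free_blocks):
    """Simple wear leveling: among free blocks, use the least-erased one."""
    return min(free_blocks, key=lambda b: erase_counts[b])


# Made-up state: block number -> count of valid pages / erase cycles so far.
valid = {0: 120, 1: 3, 2: 64}
erases = {0: 10, 1: 42, 2: 7, 3: 9}

victim = pick_gc_victim(valid)
target = pick_block_to_write(erases, free_blocks=[0, 3])
print(victim, target)
```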

1 comments

Your quantities are way off if you're trying to describe the kind of NAND flash that goes into SSDs. Typical page sizes are ~16kB plus room for ECC, so a page is several thousand physical memory cells, not just one thousand. Erase blocks are several MB, so at least a thousand pages per erase block. A single die of NAND typically has just 2 or 4 planes, each of which is at least 16GB.
The largest sizes I'm finding for erase blocks are 128 and 256KB, not several MB. I am finding larger plane sizes, which probably come from grouping more blocks together. In general, it's not massively different from what I described; it's just a difference in the sizes involved at the higher levels.
It still sounds like you're looking at tiny (≤4Gb) flash chips (or NOR?) for embedded devices, not 256Gb+ 3D NAND as used in SSDs, memory cards and USB flash drives. Micron 32L 3D NAND (released 2016) had 16MB blocks for 2-bit MLC, ~27MB blocks for 3-bit TLC. SK Hynix current 96L TLC has 18MB blocks, and even their last two generations of planar NAND had 4MB and 6MB blocks.
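
The disagreement above is mostly arithmetic, so a quick sanity check may help. This sketch assumes binary units (1 KiB = 1024 bytes) and ignores ECC and spare area; `plane_bytes` is a helper made up for this example. It multiplies out the geometry from the original post (1024 cells per page, 128 pages per block, 1024 blocks per plane) and then the page and block sizes quoted in the replies.

```python
KIB = 1024
MIB = 1024 * KIB


def plane_bytes(cells_per_page, pages_per_block, blocks_per_plane, bits_per_cell):
    """Capacity of one plane in bytes, ignoring ECC and spare area."""
    bits = cells_per_page * bits_per_cell * pages_per_block * blocks_per_plane
    return bits // 8


# Geometry from the original post, at 1 and 4 bits per cell.
small_1bit = plane_bytes(1024, 128, 1024, bits_per_cell=1)
small_4bit = plane_bytes(1024, 128, 1024, bits_per_cell=4)
print(small_1bit // MIB, small_4bit // MIB)   # plane size in MiB

# Figures from the replies: ~16 kB pages and a 16 MB erase block.
pages_per_block = (16 * MIB) // (16 * KIB)
print(pages_per_block)
```

The first geometry works out to 16 MiB and 64 MiB per plane, which matches the "16-64 MB" range in the original post, while 16 kB pages in a 16 MB block give 1024 pages per block, consistent with "at least a thousand pages per erase block".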

Having only 2 or 4 planes per die with per-die capacities of 32GB or more is a big part of why current SSDs need to be at least 512GB or 1TB in order to make full use of the performance offered by their controllers. 256GB SSDs are now all significantly slower than larger models from the same product line.
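
A rough way to see the capacity/performance link is to count how many dies and planes a drive of a given size has available to work on in parallel. This sketch assumes 32GB dies with 4 planes each (the upper figures from the comment above), ignores overprovisioning, and uses a made-up helper name `parallel_units`; how many units a particular controller needs to be saturated is not stated in the thread.

```python
DIE_GB = 32          # per-die capacity quoted in the comment
PLANES_PER_DIE = 4   # upper figure quoted in the comment


def parallel_units(drive_gb, die_gb=DIE_GB, planes_per_die=PLANES_PER_DIE):
    """Return (dies, planes) for a drive, ignoring overprovisioning."""
    dies = drive_gb // die_gb
    return dies, dies * planes_per_die


for capacity in (256, 512, 1024):
    dies, planes = parallel_units(capacity)
    print(f"{capacity} GB: {dies} dies, {planes} planes")
```

Under these assumptions a 256GB drive has 8 dies, half as many as a 512GB drive and a quarter as many as a 1TB drive, so it has proportionally fewer units to spread operations across.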