by jmpe 4378 days ago
First: I jumped into the comments before reading the article.

I have quite a bit of experience with Flash in the form of eMMC and SD/CF. SSDs aren't that much different from those at the low level.

The controller that comes with the flash storage contains a core that manages the bad blocks, comparable to bad sector management on a hard disk. The software these controllers run contains a lot of rules of thumb for managing bad blocks, which is where these full failures come from IMO.

Each controller has access to a pool of reserve blocks that are used when bad blocks are detected. Once those run out, the embedded software starts behaving erratically, and shortly afterwards the device fails completely.

I think the pool of reserve blocks is "Used_Rsvd_Blk_Cnt_Tot" in your list. Apparently there are 100, of which you consumed 0. There's a threshold at 10 so I assume that's where the diagnostics software will warn you.
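A rough sketch of that threshold comparison, assuming the usual SMART convention that an attribute counts as failed once its normalized value is less than or equal to the threshold. The function name `reserve_status` is made up for illustration, and the numbers are the 100 and 10 from your list.

```python
def reserve_status(value: int, threshold: int) -> str:
    """Classify a normalized SMART value against its threshold.

    Assumption: the attribute is considered failed once the normalized
    value drops to or below the threshold (the usual SMART convention).
    """
    if value <= threshold:
        return "failing"
    return "ok"


# Values from the list in the article: normalized value 100, threshold 10.
print(reserve_status(100, 10))
```

This only looks at the normalized value. It says nothing about how many physical reserve blocks sit behind it.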


> Apparently there are 100, of which you consumed 0.

The 100 is a normalized number, it's not the actual number of blocks. (A percentage basically, so 100% are still left.)

If the drive had used any blocks at all I'd worry about it. I would not consider it a wear indicator but rather a failure indicator.

> The 100 is a normalized number, it's not the actual number of blocks.

I'm not too sure about that. The only ref I can give is that they use the suffix "Cnt_Tot" which means "total count". When it's a percentage they denote it as such as in "Perc_Rated_Life_Used" and "Workld_Host_Reads_Perc". Don't be surprised by the low count (100).

That's how SMART attributes work. 100 means AOK and 0 means failed. The normalized number is reported by the drive, calculated by a formula based on the raw values and MTBF data determined by the manufacturer.

Ok, dug around in the code.

In Smartmontools I found the code for this variable:

http://smartmontools.sourceforge.net/doxygen/atacmds_8cpp_so...

It's code 179 (0xB3).

From Samsung's website:

http://www.samsung.com/global/business/semiconductor/minisit...

  ID # 179 Used Reserved Block Count (total)

  This attribute represents the number of reserved blocks that have been used as a result of a read, program or erase failure. This value is related to attribute 5 (Reallocated Sector Count) and will vary based on SSD density.

.. so at least for Samsung, exact numbers are used.

From Intel's website:

http://download.intel.com/newsroom/kits/ssd/pdfs/intel_ssd_5...

(Ctrl-F for "Available Reserved Space")

.. they use a normalized value (100).

So it can be either percentage or absolute value, depending on manufacturer.
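Since the meaning differs per vendor, it helps to look at the normalized value and the raw value side by side. Below is a minimal sketch that splits one attribute row in the ten-column layout printed by `smartctl -A`. The sample row is an assumption: it is built from the numbers discussed in this thread (value 100, threshold 10, raw 0), and the flag `0x0013` is a placeholder, not taken from the article. `SmartAttr` and `parse_attr_row` are names made up for this example.

```python
from typing import NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    value: int      # normalized value reported by the drive
    worst: int      # lowest normalized value seen so far
    threshold: int  # vendor-defined failure threshold
    raw: str        # vendor-specific raw value, kept as text


def parse_attr_row(row: str) -> SmartAttr:
    """Parse one attribute row of `smartctl -A` style output.

    Assumed columns:
    ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    # Split on whitespace into at most 10 fields so that a raw value
    # containing spaces stays together in the last field.
    fields = row.split(None, 9)
    if len(fields) < 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}")
    return SmartAttr(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        threshold=int(fields[5]),
        raw=fields[9].strip(),
    )


# Hypothetical sample row, assembled from the numbers in this thread.
SAMPLE_ROW = (
    "179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010"
    "    Pre-fail  Always       -       0"
)

attr = parse_attr_row(SAMPLE_ROW)
print(f"{attr.name}: normalized={attr.value} "
      f"threshold={attr.threshold} raw={attr.raw}")
```

Whether `raw` is an actual block count or something else is up to the vendor, so the sketch keeps it as text instead of guessing.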