by dandrews
3759 days ago
Something I was unaware of: single-bit (correctable) read errors are not immediately repaired in NAND. Subsequent reads of the same still-erroneous data are corrected by ECC on the fly each time, while the media rewrite is scheduled for sometime later. This inflates the observed correctable error rate to some extent.
|
For any given type of flash chip the manufacturer will provide a spec like "you must be able to correct N bits per M bytes". The firmware for a flash drive must use forward error correction codes (e.g. BCH or LDPC) of sufficient strength to correct the specified number of bit errors.
Dealing with a certain number of bit errors is just part of dealing with flash.
For example, a chip could have a spec that you must be able to correct up to 8 bit errors per 512 bytes (made up numbers). If the chip had 4KiB pages, each page would be split into 8 "chunks" that were each protected by error correcting codes that were capable of correcting up to 8 single-bit errors in that chunk. As long as no "chunk" had more than 8 bit errors, the read would succeed.
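So in this case you could theoretically have a page read with 64 bit errors (8 in each of the 8 chunks) that still succeeded, while a page with only 9 bit errors would fail if all 9 landed in the same chunk.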
This is alluded to in the paper: "The first generation of drives report accurate counts for the number of bits read, but for each page, consisting of 16 data chunks, only report the number of corrupted bits in the data chunk that had the largest number of corrupted bits."
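
To put numbers on the chunk example, here is a quick Python sketch using the same made-up figures (8 correctable bits per 512-byte chunk, 4KiB pages). The function names are mine, not from any firmware or from the paper, and it ignores where the parity bytes actually live; it only checks per-chunk error counts against the budget.

```python
PAGE_BYTES = 4096      # 4KiB page, as in the example above
CHUNK_BYTES = 512      # made-up ECC chunk size
MAX_CORRECTABLE = 8    # made-up correction strength per chunk
BITS_PER_CHUNK = CHUNK_BYTES * 8


def chunk_error_counts(error_bit_positions):
    """Count bit errors per chunk, given bit offsets within the page."""
    counts = [0] * (PAGE_BYTES // CHUNK_BYTES)
    for bit in error_bit_positions:
        counts[bit // BITS_PER_CHUNK] += 1
    return counts


def page_read_succeeds(error_bit_positions):
    """The read succeeds only if every chunk is within the ECC budget."""
    return all(n <= MAX_CORRECTABLE
               for n in chunk_error_counts(error_bit_positions))


def reported_error_count(error_bit_positions):
    """What a drive that reports only the worst chunk would expose."""
    return max(chunk_error_counts(error_bit_positions))


# 8 errors in each of the 8 chunks: 64 errors in total.
spread = [c * BITS_PER_CHUNK + i for c in range(8) for i in range(8)]
# 9 errors all in the first chunk: only 9 errors in total.
clustered = list(range(9))
```

Here `page_read_succeeds(spread)` is true despite 64 bit errors, while `page_read_succeeds(clustered)` is false with only 9. Note that `reported_error_count(spread)` is 8, not 64, which is the kind of undercount the quoted passage describes for the first-generation drives.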