HBase 0.90.4: we had problems with I/O. We had a very heavy read load on top of a write load that burst to 14,000 TX per second and averaged 8,000 per second, with each record around 2k.
Because of the I/O, the WAL had to be turned off, which introduced problems when Region Servers occasionally died. We implemented large regions (10GB) and fairly large HBlocks (512MB), and increased flush sizes to reduce minor compactions. We used MSLAB to virtually eliminate GC altogether, with a large heap (12GB) on each RS.
The worst problem we experienced was META corruption; that really, really sucked.
Thanks. If there's a more detailed writeup you can point me to that'd be great. I would like to make sure then that all these issues are addressed in the current versions.
0.94+ has MSLAB by default, and with HFileV2 (0.92+) we can support much larger regions (20G or bigger).
Curious about the 512M HBlocks, did you have scan-heavy read-load?
14k TX peak per regionserver? At 2k per record that's 28M/s (56M with WAL, since every write also goes to the log). Should be doable now even with WAL (definitely with deferred flush). Well, maybe not with a concurrent very heavy read load, depending on disk configuration.
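For anyone who wants to redo that back-of-envelope estimate with their own numbers, here is a rough sketch. The function name and the constants are just illustrative; it assumes "2k" means 2,000 bytes and that turning the WAL on simply doubles the bytes written. It ignores HDFS replication, compactions and the read load, so treat it as a lower bound rather than a real capacity model.

```python
def write_throughput_bytes(tx_per_sec, record_bytes, wal_enabled=True):
    """Approximate bytes per second written by one regionserver.

    Assumption: with the WAL enabled, each record is written twice
    (once to the log, once to the store), so the volume doubles.
    """
    data_bytes = tx_per_sec * record_bytes
    return data_bytes * 2 if wal_enabled else data_bytes


def to_mb(num_bytes):
    """Convert bytes to decimal megabytes (1 MB = 1,000,000 bytes)."""
    return num_bytes / 1_000_000


# Figures quoted in this thread; RECORD_BYTES is an assumed reading of "2k".
PEAK_TX = 14_000
AVG_TX = 8_000
RECORD_BYTES = 2_000

if __name__ == "__main__":
    for label, tx in (("peak", PEAK_TX), ("average", AVG_TX)):
        no_wal = to_mb(write_throughput_bytes(tx, RECORD_BYTES, wal_enabled=False))
        with_wal = to_mb(write_throughput_bytes(tx, RECORD_BYTES, wal_enabled=True))
        print(f"{label}: {no_wal:.0f} MB/s without WAL, {with_wal:.0f} MB/s with WAL")
```

If the 14k figure was for the whole cluster rather than per regionserver, divide by the number of regionservers first.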
Probably on top of Hadoop 0.20-append?
Hadoop-2.x.x should be far better too.