by andywarfield
4199 days ago
Getting things perfect is pretty hard: it basically requires being able to see the future. There are still a bunch of examples where imperfect decisions end up being a really significant win. For instance:

1. The counter stacks that we build using the HLLs allow us to draw miss ratio curves. These are a model for how much a given application will benefit from getting additional fast memory (whether that fast memory is flash, RAM, etc.). A useful aspect of this isn't just deciding which workloads to give more memory to, but also identifying the workloads that you simply can't win on. For workloads with absolutely enormous working sets, you can decide to give them far fewer resources, and then end up speeding up all the other applications in the system. Alternatively, you can decide to give them the large amount of memory that they do need to be successful, possibly by buying additional fast memory. In either case, detailed working set analysis makes it a lot easier to make appropriate decisions.

2. 80/20 is completely anecdotal -- the appropriate split is always going to be entirely dependent on workload. That said, as you move out into the tail of the access frequency curve, cost per access increases. So even if you don't move data out to disk -- even with only a single type of storage media -- understanding these curves can allow you to decide to do things like compress the colder half of your storage, trading off some compute work and latency on access for more space. In the case of hybrid systems, strategies like this have the potential to really increase the effectiveness of caches.

Totally agree with your other points. There are many factors to consider and balance in any system design. Note that for raw media, magnetic disks are still way more reliable than NAND in terms of archiving your bits without data errors over long periods of time. In either case, good system design needs redundancy and scrubbing to make sure that device errors are caught and recovered from before data is actually lost.
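
To make the miss ratio curve idea from point 1 a bit more concrete, here is a minimal sketch. It is not the counter stack / HLL approach itself: it computes exact LRU stack distances the brute-force way, which is the expensive calculation that an approximate structure is meant to avoid. The function name `miss_ratio_curve` and the toy trace are made up for illustration.

```python
from collections import OrderedDict


def miss_ratio_curve(trace, max_cache_size):
    """Exact LRU miss ratio for cache sizes 1..max_cache_size.

    Brute-force stack distance calculation; illustrative only, since the
    per-access list scan makes this far too slow for real traces.
    """
    if not trace:
        return []

    stack = OrderedDict()  # keys ordered from least to most recently used
    hits_at_distance = [0] * (max_cache_size + 1)

    for key in trace:
        if key in stack:
            # Stack distance: how many distinct keys (including this one)
            # were touched since the previous access to this key.
            keys = list(stack)
            distance = len(keys) - keys.index(key)
            if distance <= max_cache_size:
                hits_at_distance[distance] += 1
            stack.move_to_end(key)
        else:
            stack[key] = True  # first access is a miss at every cache size

    total = len(trace)
    curve = []
    hits = 0
    for size in range(1, max_cache_size + 1):
        # An LRU cache of this size hits on every access whose stack
        # distance is at most the cache size.
        hits += hits_at_distance[size]
        curve.append(1 - hits / total)
    return curve


curve = miss_ratio_curve(list("abacba"), 3)
for size, ratio in enumerate(curve, start=1):
    print(f"cache size {size}: miss ratio {ratio:.2f}")
```

Reading the result: if the curve stays flat as the cache size grows toward what you could actually provision, that workload is one of the "can't win" cases, and the memory is probably better spent elsewhere. A sharp drop marks the working set size where extra fast memory starts paying off.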