by jeffbee
883 days ago
I think the chief source of inefficiency in this architecture would be the NVMe controller. When the operating system and the NVMe device are at arm's length, there is natural inefficiency, because the controller has to infer the intent of each request and do its best on placement and wear leveling. The new FDP (Flexible Data Placement) features try to address this by giving the operating system more control. The best thing would be to hoist it all up into the host operating system and present the flash, as nearly as possible, as a giant field of dumb transistors that happens to be a PCIe device. With those layers of abstraction removed, the hardware unit could be something like an Atom with integrated 100 Gbps NICs and a proportional amount of flash to achieve the desired system parallelism.
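To make "hoist it all up into the host" a bit more concrete, here is a rough toy sketch of what host-side placement and wear leveling might look like if the device only exposed raw erase blocks. Everything in it (the `HostFTL` class, its fields, the block and page counts) is hypothetical and made up for illustration; it is not how any real NVMe, FDP, or open-channel interface works, and it leaves out garbage collection, ECC, and bad-block handling entirely.

```python
from dataclasses import dataclass, field


@dataclass
class HostFTL:
    """Toy host-side flash translation layer (hypothetical, illustration only)."""

    num_blocks: int
    pages_per_block: int
    erase_counts: list = field(init=False)
    free_blocks: set = field(init=False)
    mapping: dict = field(init=False)  # logical page -> (block, page)
    active: int = field(init=False)
    next_page: int = field(init=False)

    def __post_init__(self):
        self.erase_counts = [0] * self.num_blocks
        self.free_blocks = set(range(self.num_blocks))
        self.mapping = {}
        self.active = self._open_block()
        self.next_page = 0

    def _open_block(self):
        # Wear leveling: open the free block with the fewest erases
        # (ties broken by lowest block number).
        if not self.free_blocks:
            raise RuntimeError("no free blocks; the host would need to garbage collect")
        block = min(self.free_blocks, key=lambda b: (self.erase_counts[b], b))
        self.free_blocks.remove(block)
        return block

    def write(self, logical_page):
        # Flash pages cannot be overwritten in place, so every write is
        # appended to the active block and the old mapping becomes stale.
        if self.next_page == self.pages_per_block:
            self.active = self._open_block()
            self.next_page = 0
        self.mapping[logical_page] = (self.active, self.next_page)
        self.next_page += 1
        return self.mapping[logical_page]

    def erase(self, block):
        # The host, not the controller, decides when a block is reclaimed.
        if block == self.active or block in self.free_blocks:
            raise ValueError("block is active or already free")
        if any(b == block for b, _ in self.mapping.values()):
            raise ValueError("block still holds live pages")
        self.erase_counts[block] += 1
        self.free_blocks.add(block)


ftl = HostFTL(num_blocks=3, pages_per_block=2)
ftl.write(10)
ftl.write(11)
ftl.write(10)  # rewrite lands in a fresh block
ftl.write(11)  # block 0 now holds only stale pages
ftl.erase(0)
print(ftl.write(12))  # goes to the never-erased block 2, not the worn block 0
```

The point of the sketch is only that the host already knows which data is hot, cold, or dead, so it can make the placement decision directly instead of the controller guessing.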