Hacker News
by ChuckMcM 3330 days ago
This is perhaps the most interesting aspect of it to me. When we relax the constraint that 'mass storage access must be as linear and infrequent as possible', what sort of possibilities does that open up in the design space that were previously untenable?

Nice work and thank you for sharing it.

It's not that easy, actually. The simplest method that can utilize the full throughput of the drive is to use large writes (1MB or larger). This is the fastest possible way to write data to the SSD, period. This method also creates the simplest possible FTL mapping table.
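As a rough illustration of the large-write approach, here is a minimal sketch in Python. The function name `write_sequential` and the `CHUNK` constant are made up for this example, and it goes through the normal page cache rather than O_DIRECT, so it shows the access pattern rather than a tuned benchmark.

```python
import os

CHUNK = 1 << 20  # 1 MiB per write; the comment above suggests 1MB or larger


def write_sequential(path, total_bytes, chunk_size=CHUNK):
    """Write total_bytes of zeros to path in large sequential chunks.

    Hypothetical helper for illustration. Returns the number of bytes written.
    """
    buf = bytes(chunk_size)
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while written < total_bytes:
            n = min(chunk_size, total_bytes - written)
            view = memoryview(buf)[:n]
            # os.write may write fewer bytes than requested, so loop until done.
            while len(view) > 0:
                k = os.write(fd, view)
                view = view[k:]
            written += n
        os.fsync(fd)  # make sure the data has actually been pushed to the device
    finally:
        os.close(fd)
    return written
```

Calling `write_sequential("/tmp/seq.bin", 64 * CHUNK)` would issue 64 back-to-back 1 MiB writes. Whether that saturates a given drive depends on the device and filesystem.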

Random reads and writes are significantly slower if you issue everything from one thread. To speed things up you should issue requests in parallel (for example using Linux AIO + O_DIRECT, or libuv + O_DIRECT). OS-level buffering and many OS threads will deliver good random write throughput as well.
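The "OS-level buffering and many OS threads" variant can be sketched as follows. This is an assumption-laden example, not the AIO + O_DIRECT setup mentioned above: it uses a thread pool and `os.pwrite` (Unix only) on a shared file descriptor, and the names `write_block`, `write_random_parallel`, and `BLOCK` are hypothetical. O_DIRECT would additionally require block-aligned buffers and offsets, which is left out here.

```python
import os
import random
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096  # assumed block size for the random writes


def write_block(fd, index, payload):
    # os.pwrite takes an explicit offset, so threads can share one descriptor
    # without racing on the file position.
    return os.pwrite(fd, payload, index * BLOCK)


def write_random_parallel(path, num_blocks, workers=8, seed=0):
    """Write num_blocks blocks in shuffled order from several threads.

    Hypothetical helper for illustration. Returns the total bytes written.
    """
    order = list(range(num_blocks))
    random.Random(seed).shuffle(order)  # random offsets, each written once
    payload = b"\xab" * BLOCK
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            total = sum(pool.map(lambda i: write_block(fd, i, payload), order))
        os.fsync(fd)  # flush buffered writes to the device
    finally:
        os.close(fd)
    return total
```

Raising `workers` increases the number of writes in flight at once, which is the point of the parallel approach. The right value for a particular drive is something to measure rather than assume.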

There are other effects to consider, e.g. read-write interference.

I understand. I would expect that you will get an additional boost if you target Intel's 'Optane' technology which, by its design, allows for a much faster channel turnaround and so less interference. And in the fairly recent past other vendors like Texas Memory Systems developed strategies which were all RAM and a bit of cleverness to snapshot to HD when the power fails. The point being that with enough money you could brute-force the solution, but now the money required is decreasing and so new strategies are opening up.

If I understand this right, with Intel's Optane you will eventually need to write everything to HDD because data collection happens at a steady pace and the cache size is limited.

Depends on the size of your data set. Intel's plan, according to their web site, is to replace the SSDs (especially NVMe ones) with Optane-based solid state memory. The road map has them shipping exabytes of the stuff eventually.

So as I see it you'd be constrained by 32 GB Optane modules today, but they will eventually (one, maybe two years) be 2 TB modules like the Samsung 960 Pro modules are today. And an M.2 port is really just a PCIe slot, so you're looking at systems with maybe 32 TB of Optane storage on the high end within the next 5 years.

My understanding is that even SSDs perform somewhat better sequentially (throughput-wise), though the difference isn't quite as dramatic as with HDDs. That said, the 400 MB/s random write speed for NVMe mentioned is plenty faster than the sequential write speeds most people had access to with SSDs until recently, so that's pretty interesting.