Hacker News
by lichtenberger 2118 days ago
You can efficiently read data at 256-byte granularity (4 cache lines) with Optane Memory (due to checksums). I think it makes much more sense to read and write fine-grained changes, for instance aligning pages to 64 or 256 bytes instead of using 4 KB pages, where you often write too much data and also pollute the caches with data that is probably unnecessary. There's a paper about adding cache-line-aligned mini-pages (16 cache lines): https://db.in.tum.de/~leis/papers/nvm.pdf
3 comments

I think along those lines, reading and writing fine-grained data, especially when storing the full history of the data, is the way to go (in addition to path copying and log-structured storage). Furthermore, pointer swizzling might be needed to get close to in-memory database performance.

I'm trying to address this with my versioned open-source DBMS project, where only page fragments are stored and a bunch of them are read in parallel to reconstruct a full page in memory. Adding mini-pages, plus at some point a simple cache with hot data from several page fragments, is at least orthogonal:
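
To make the idea of rebuilding a page from fragments concrete, here is a minimal sketch in Python. It is not how SirixDB implements it; the in-memory `STORAGE` dict, the fragment ids, and the "newest fragment wins per slot" rule are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical storage: fragment id -> {slot: record}. Each fragment holds
# only the records that changed in that revision (an assumption).
STORAGE = {
    "rev3": {2: "c'"},
    "rev2": {0: "a'", 1: "b"},
    "rev1": {0: "a", 2: "c", 3: "d"},
}

def read_fragment(fragment_id):
    """Stand-in for a storage read of one page fragment."""
    return STORAGE[fragment_id]

def reconstruct_page(fragment_ids_newest_first):
    """Read all fragments in parallel, then merge them so that the newest
    version of each slot wins."""
    with ThreadPoolExecutor() as pool:
        # pool.map returns results in input order, so newest stays first.
        fragments = list(pool.map(read_fragment, fragment_ids_newest_first))
    page = {}
    for fragment in fragments:
        for slot, record in fragment.items():
            page.setdefault(slot, record)  # keep the first (newest) value
    return page

page = reconstruct_page(["rev3", "rev2", "rev1"])
```

The reads are independent of each other, which is why they can run in parallel; only the merge step needs the fragments in revision order.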

https://github.com/sirixdb/sirix | https://sirix.io

Writing on a page (4 KB / 8 KB) boundary is almost always a good idea, because there is a per-page overhead that you have to account for. It can be in the range of 16 to 64 bytes, so a 256-byte mini-page is probably a bad idea.

Most of the _data_ is also not going to fit in 256 bytes anyway.
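
The overhead argument above can be put in numbers with a short sketch. The 16 and 64 byte header sizes and the page sizes are the ones mentioned in this thread; the helper name `header_overhead` is made up for illustration.

```python
def header_overhead(page_size: int, header_size: int) -> float:
    """Fraction of a page that is taken up by its per-page header."""
    return header_size / page_size

for page_size in (256, 4096, 8192):
    for header_size in (16, 64):
        share = header_overhead(page_size, header_size)
        print(f"{page_size:>5} B page, {header_size:>2} B header: {share:.2%}")
```

With these figures, a 256-byte mini-page spends between 6.25% and 25% of its space on the header, while a 4 KB page spends between roughly 0.4% and 1.6%.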

It might add some overhead, though I guess that depends on the page implementation; 16 bytes seems to be the minimum (and Optane Memory might , I agree. That said, if someone changes only one record, the best thing is to write at the smallest granularity the storage medium supports.

So, it might well be that someone is interested in only one or a few records. Why then fetch and cache a whole 4 KB page if latency is good in both cases (4 KB and 256 bytes)? On the other hand, I agree that you should probably cache more data from a hot page.
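
As a rough illustration of the point about fetching a single record, the sketch below computes how many bytes are moved when access happens in fixed-size units. The 100-byte record size is an arbitrary example, and the calculation assumes the record starts at a unit boundary, so it never straddles one more unit than necessary.

```python
import math

def bytes_transferred(record_size: int, granularity: int) -> int:
    """Bytes moved to access one record when I/O happens in whole units of
    `granularity` bytes (assumes the record is aligned to a unit start)."""
    return math.ceil(record_size / granularity) * granularity

record_size = 100  # hypothetical record size in bytes
for granularity in (256, 4096):
    moved = bytes_transferred(record_size, granularity)
    print(f"{granularity:>4} B units: {moved} B moved for a {record_size} B record")
```

Under these assumptions, a 4 KB page moves 16 times as many bytes as a 256-byte unit for the same small record, and the extra bytes also occupy cache space.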

FYI, Intel Optane Memory is a wildly different product from the one under discussion, which is Optane DC Persistent Memory or Optane DCPMM. The former is a low-capacity consumer SSD caching software solution, and the latter is persistent memory modules in a DIMM form factor.
Yes, I mean Optane DC Persistent Memory. It's hard because they seem to market three different product categories under almost the same name (Optane Memory, Optane DC Persistent Memory and SSDs).