| Very cool. I noticed that all the write benchmarks in https://apple.github.io/foundationdb/benchmarking.html are for random writes. Is write throughput affected by highly sequential writes (e.g. time series) vs random writes? How do you avoid hot-spotting on recent ranges? How efficient are range deletes? On https://apple.github.io/foundationdb/performance.html I read "The memory engine is optimized for datasets that entirely fit in memory, with secondary storage used for durable writes but not reads." I'd like some clarification: (1) Which memory does "entirely fit in memory" refer to? A single machine? Or SingleNodeMemory * Nodes / ReplicationFactor? (2) If only recently-written data is likely to be queried, and all recently-written data fits entirely in memory, is that sufficient? If so, would an unexpected query of old data cause a huge impact on write throughput? (3) What is the structure/format of the data stored on disk? How is it updated? I'm wondering how well this could be used for time series data. I saw mention here that Wavefront uses FoundationDB for this, but would like more details if any are available. |
You can mitigate hot-spotting from highly sequential writes by designing your key structure and data ordering so that consecutive writes do not all land in the same key range.
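To make the key-design point concrete, here is a minimal sketch of one common approach: prefix each key with a small shard byte derived from the series name, so writes for different series spread across the keyspace while each series still sorts by time. This is plain Python that only builds byte strings; it does not call the FoundationDB client. The names `NUM_SHARDS`, `shard_for`, `make_key`, and `series_range` are hypothetical, and the layout is just one option, not something FDB prescribes. It also assumes series names contain no zero byte.

```python
import struct
import zlib
from typing import Tuple

NUM_SHARDS = 16  # hypothetical shard count, must be <= 256 to fit in one byte


def shard_for(series_id: str) -> int:
    # Stable hash, so a given series always maps to the same shard prefix.
    return zlib.crc32(series_id.encode("utf-8")) % NUM_SHARDS


def make_key(series_id: str, timestamp_ms: int) -> bytes:
    # Layout: 1-byte shard | series id | 0x00 separator | 8-byte big-endian timestamp.
    # Big-endian keeps byte order equal to numeric order for non-negative timestamps.
    return (
        bytes([shard_for(series_id)])
        + series_id.encode("utf-8")
        + b"\x00"
        + struct.pack(">Q", timestamp_ms)
    )


def series_range(series_id: str, start_ms: int, end_ms: int) -> Tuple[bytes, bytes]:
    # Half-open [begin, end) key range covering one series over a time window.
    return make_key(series_id, start_ms), make_key(series_id, end_ms)
```

With this layout a range read for one series and time window is still a single contiguous scan, but the newest points of different series no longer sit next to each other. It does not help if nearly all your writes go to one series; in that case you would need to spread that series over several prefixes and merge on read.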
The memory engine requires your data to fit in memory: the replicated dataset has to fit in the combined memory of all your nodes, not a single machine. It writes interleaved snapshots and updates to disk, and reads the whole dataset back into memory when restarted.
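As a back-of-the-envelope check on the sizing rule, here is a small sketch. The function name and parameters are hypothetical, and it deliberately ignores per-key overhead, headroom, and any engine-specific limits, so treat it as a rough lower bound on what you need rather than a sizing tool.

```python
GiB = 1024 ** 3


def fits_in_memory(dataset_bytes: int, replication_factor: int,
                   node_count: int, usable_bytes_per_node: int) -> bool:
    # Every replica has to live in some node's memory, so the requirement
    # is the logical dataset size multiplied by the replication factor.
    required = dataset_bytes * replication_factor
    # The budget is the memory you are willing to give the engine, summed over nodes.
    available = node_count * usable_bytes_per_node
    return required <= available


# 100 GiB of data at 3x replication needs 300 GiB in total.
print(fits_in_memory(100 * GiB, 3, 6, 64 * GiB))  # 6 nodes * 64 GiB = 384 GiB
print(fits_in_memory(100 * GiB, 3, 4, 64 * GiB))  # 4 nodes * 64 GiB = 256 GiB
```

The first call prints `True` and the second prints `False`, which is the asker's SingleNodeMemory * Nodes / ReplicationFactor formula rearranged.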
You can model time series data well in FDB, though it takes some care and thought.
You should ask these questions on the forum. This article is falling off HN, I am going to lose track of it, and it doesn't look like the Apple team is answering questions here.