by shrubble
2117 days ago
If I am understanding you properly, couldn't you do this in SQL by selecting the range of data that represents the timeseries you care about through a materialized view? If a stored procedure computes the range that becomes the view, then all you need to store are the parameters to feed back into that stored procedure, and those parameters could themselves live in a separate table.
|
Given just 100M keys (let's call it a 20GB exported snapshot size) and 1M versions, that's an overwhelming amount of data (roughly 20PB if every version is materialized in full), and 99.9999% of it is redundant copies of the same information, i.e. the stuff that didn't change between versions.
Solving the problem of concurrent materializations requiring petabytes of storage for almost-entirely-redundant heap tuples is essentially solving the problem of creating a tuple-deduplicating DBMS storage engine, which is equivalent to the problem of building a versioned embedded database :)
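To make the numbers and the deduplication idea concrete, here is a toy sketch. The snapshot size and version count come from the figures above; `DedupStore` and its methods are hypothetical names, and this is an in-memory illustration of storing each distinct value once, not how any real storage engine is implemented.

```python
import hashlib

# Figures from the comment above: 20 GB per full snapshot, 1M versions.
SNAPSHOT_BYTES = 20 * 10**9
VERSIONS = 10**6
# Materializing every version in full: 2 * 10**16 bytes, i.e. 20 PB.
naive_total_bytes = SNAPSHOT_BYTES * VERSIONS


class DedupStore:
    """Toy versioned key-value store that keeps each distinct value once."""

    def __init__(self):
        self.blobs = {}     # content digest -> value bytes (stored once)
        self.versions = []  # one dict per version: key -> content digest

    def commit(self, changes):
        """Create a new version from the previous one plus `changes`."""
        index = dict(self.versions[-1]) if self.versions else {}
        for key, value in changes.items():
            digest = hashlib.sha256(value).hexdigest()
            # Unchanged or repeated values reuse the existing blob.
            self.blobs.setdefault(digest, value)
            index[key] = digest
        self.versions.append(index)
        return len(self.versions) - 1

    def get(self, version, key):
        """Read `key` as it was at `version`."""
        return self.blobs[self.versions[version][key]]
```

This sketch still copies the whole key-to-digest index for every version, so it only deduplicates values. Sharing the index between versions as well is the harder part of the problem described above.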