|
|
|
|
|
by aravindet
1888 days ago
|
|
I am considering Materialize for a project, and I'm concerned about the limitation that materialized views should fit into the memory of a single host. My use case is a materialized OLAP "fact" tables, created by joining a 5-10 separate tables. While the biggest source tables have under a million rows, the joined fact table would have perhaps 100 million rows which I fear might be too big. One workaround I considered is creating a non-materialized view, TAIL-ing it, and persisting the results to another database. Could you comment on this problem and workaround? |
|
Do you have a limit in mind on how large you're willing to go? A brief bit of napkin math shows that, at 1KB per record, your largest view should fit within 100GB of memory. This is easily supported by Materialize and I can certainly understand if this exceeds your appetite!
If you're interested in reducing the size of the in-memory dataset, does your fact table have a temporal dimension to it? By default, Materialize stores data for all time (like a database) and you can write views in such a way that it will only materialize recent data (like a stream processor). Our co-founder Frank wrote a blog post[1] detailing how to do this.
I'm honestly not sure how TAIL-ing a non-materialized view would perform. Sounds like something that would be fun to test!
Happy to chat about this further in our community Slack channel if you have more questions.
[1] https://materialize.com/temporal-filters/