Hacker News new | ask | show | jobs
by gfody 2025 days ago
there are messages like this in the docs:

> "WARNING! LATERAL subqueries can be very expensive to compute. For best results, do not materialize a view containing a LATERAL subquery without first inspecting the plan via the EXPLAIN statement. In many common patterns involving LATERAL joins, Materialize can optimize away the join entirely. "

I take this to mean that Materialize cannot always efficiently maintain a view with lateral joins - that's fine neither can SQL Server, but it would be nice if I could find all these exceptions in one place like I can for SQL Server.

..fwiw I prefer the behavior of failing early rather than letting potential severe performance problems into prod.

[1] https://materialize.com/docs/sql/join/#lateral-subqueries

1 comments

> I take this to mean that Materialize cannot efficiently maintain a view with lateral joins [...]

Well, no this isn't a correct take. Lateral joins introduce what is essentially a correlated subquery, and that can be surprisingly expensive, or it can be fine. If you aren't sure that it will be fine, check out the plan with the EXPLAIN statement.

Here's some more to read about lateral joins in Materialize:

https://materialize.com/lateral-joins-and-demand-driven-quer...

sorry you missed my ninja-edit - it sounds like SOME lateral join queries CAN be efficiently maintained but not ALL (not the ones that are surprisingly expensive for whatever reason) that's where the promise of "we can materialize any query!" starts to fall apart for me. presumably the surprisingly expensive cases are the ones where some rewrite rules can't guarantee correctness without hiding indexes or predicate pushdowns or whatever - the doc says review the explain plan first but what precisely about the explain plan would tell me that the materialized view won't be efficiently maintained? ideally these cases can be known ahead of time so I can come up with a conformant query rather than trying variations to see what works.

..and more to the point, there are obviously limits to what can be efficiently maintained. I would love to see that list as this is what would give me a good idea of how Materialize compares to my daily driver RDBMS which happens to be SQL Server and whose limits I'm unfortunately intimately familiar.

I don't think there is anything fundamentally different from an existing database. In all relational databases, some lateral joins can be expensive to compute. In Materialize, those same lateral joins will also be expensive to maintain.

I'd be surprised to hear you beat up postgres or SQL Server because they claim they can evaluate any SQL query, but it turns out that some SQL queries can be expensive. That's all we're talking about here.

I am genuinely interested in Materialize's capability to incrementally maintain views and I understand there are all sorts of limitations as to when that's even possible - I can't find a comprehensive list of them. I don't think it's fair to say you support every possible select statement and then just have some of them be slow. The lateral join case was the first warning I encountered in the docs - is that the ONLY case and every other possible select statement can be incrementally maintained?
All queries are incrementally maintained with the property that we do work proportional to the number of records in difference at each intermediate stage of the query plan. That includes those with lateral joins; they are not an exception.

I'm not clear on your "all sorts of limitations"; you'll have to fill me in on them?

> I'm not clear on your "all sorts of limitations"; you'll have to fill me in on them?

this feels like bait but honestly I'm under the impression that incrementally updating materialized views (where optimal = the proportion of changed records) just isn't always possible. for example, max and min aggregates aren't supported in SQL Server because updating the current max or min record requires a query to find the new max or min record - that's not considered an incremental update and so it's not supported and trying to materialize the view fails. there are a number of cases like this and a big part of problem solving with SQL Server is figuring out how to structure a view within these constraints. if you can then you can rest assured that updates will be incremental and performant - this is important because performance is the feature, if the update is slow then my app is broken. if Materialize has a list of constraints shorter than SQL Server's then you're sitting on technology worth billions - it's hard for me to believe that your list of constraints is "there are none" especially when there are explicit-but-vague performance warnings in the docs.