| HN Mirror

For complex things like MATCH_RECOGNIZE (and CASTs) whose syntax and semantics differ across underlying systems unfortunately the result will be that some data is going to be pulled and be processed in Trino - so it'll be slower than native. If you are only dealing with a single data source (unless it's not an RDBMS, say files on S3 or an API) I'd say Trino is not needed and would slow you down.

The rule of thumb is that Trino aims to provide a uniform layer over whatever sits underneath. So operations which when "pushded down" to the source result in same results as when executed within Trino do get "pushed down" - i.e. executed on the source. But in cases where the results might differ or it's complex to push-down the operation the operation runs within Trino - i.e. pull data from source (minimum needed data) and then perform operation within Trino.

Note that it's not an all-or-nothing case, e.g. a query like:

    SELECT n.name, r.name
    FROM postgresql.tpch.nation n
    LEFT JOIN postgresql.tpch.region r ON r.regionkey = n.regionkey
    WHERE n.name > 'A'
        AND r.name = 'ASIA'

will result in the following query to Postgres:

    SELECT n.name, r.name FROM tpch.nation n LEFT JOIN tpch.region r ON r.regionkey = n.regionkey WHERE r.name = 'ASIA'

The rest of the query (n.name > 'A') would be applied in Trino to results fetched from Postgres because the collation in Postgres will affect results if we push complete query down to Postgres and may not match results when entire query would be processed by Trino. With single data source this is not easily appreciated but e.g. if the second table in the query came from SQL Server then you'd want to have a consistent comparision logic regardless of source of table.

This is a very simplified example though.