Cloudflare's data platform and the AI agent on top of it (blog.cloudflare.com)
3 points by jgrahamc 14 days ago
4 comments

> PII is opt-in per session. By default, Trino redacts sensitive columns before they ever hit your screen. If you have a legitimate need for raw PII (e.g., fraud investigation), you flip the bit on the session, your permissions are checked, and the redaction is lifted. The flip and every query is logged.

That's an interesting thing to say. My instinct is to log all queries anyway, so that someone can come in later, notice that, say, 30% of queries in Q1 involve the Foo system and that it is slow or expensive, and then go optimize the Foo system and save the company a measurable amount of time and money.
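To make that concrete, here is a rough sketch of the kind of after-the-fact analysis a full query log would allow. The log format, the system names, and the `system_share` helper are all made up for illustration; nothing here reflects how Cloudflare actually stores or analyzes its query logs.

```python
from collections import Counter

# Hypothetical query-log records: (quarter, systems touched, runtime in seconds).
QUERY_LOG = [
    ("Q1", {"foo", "bar"}, 12.0),
    ("Q1", {"bar"}, 1.5),
    ("Q1", {"foo"}, 30.0),
    ("Q1", {"baz"}, 0.5),
    ("Q2", {"foo"}, 3.0),
]


def system_share(log, quarter):
    """Return {system: (fraction of queries touching it, total runtime in seconds)}."""
    rows = [r for r in log if r[0] == quarter]
    if not rows:
        return {}
    counts = Counter()
    runtime = Counter()
    for _, systems, seconds in rows:
        for system in systems:
            counts[system] += 1
            runtime[system] += seconds
    return {s: (counts[s] / len(rows), runtime[s]) for s in counts}


if __name__ == "__main__":
    # Print systems ordered by total runtime, most expensive first.
    for system, (share, seconds) in sorted(
        system_share(QUERY_LOG, "Q1").items(), key=lambda kv: -kv[1][1]
    ):
        print(f"{system}: {share:.0%} of queries, {seconds:.1f}s total")
```

With a log like that, "which system should we optimize first" becomes a sort by share or by total runtime.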

Interesting.

> Apache Trino for that: a single SQL query can join a Postgres table, a ClickHouse table, and an Iceberg table on R2 without a need to materialize the intermediate results into a different system.

How does it manage to do that? I'd think you'd need intermediate results somewhere.
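My rough understanding is that the intermediate results do exist, but only in the query engine's memory while the query runs, not as a table written to some other system. Here is a toy sketch of that idea using a streaming hash join over three generators. The `scan_*` functions and their rows are hypothetical stand-ins for connectors, and the real engine is distributed and far more involved than this.

```python
def scan_postgres():
    # Hypothetical stand-in for a Postgres table scan.
    yield {"user_id": 1, "name": "ada"}
    yield {"user_id": 2, "name": "bob"}


def scan_clickhouse():
    # Hypothetical stand-in for a ClickHouse table scan.
    yield {"user_id": 1, "event": "login"}
    yield {"user_id": 2, "event": "purchase"}
    yield {"user_id": 3, "event": "login"}


def scan_iceberg():
    # Hypothetical stand-in for an Iceberg table scan.
    yield {"user_id": 1, "plan": "pro"}
    yield {"user_id": 2, "plan": "free"}


def hash_join(build, probe, key):
    """Inner join: hash the build side in memory, then stream the probe side."""
    table = {}
    for row in build:
        table.setdefault(row[key], []).append(row)
    for row in probe:
        for match in table.get(row[key], []):
            yield {**match, **row}


def federated_query():
    # The first join's output is a lazy generator that feeds the second join
    # row by row; it is never written out to a separate store.
    users_events = hash_join(scan_postgres(), scan_clickhouse(), "user_id")
    return hash_join(scan_iceberg(), users_events, "user_id")


if __name__ == "__main__":
    for row in federated_query():
        print(row)
```

The only state held between steps is the in-memory hash tables for the build sides, which is why no staging table is needed as long as those fit in memory.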

Congratulations on this! Did you consider https://github.com/obi1kenobi/trustfall to query all your data sources? If so, what were the limits of that engine?
Nice writeup, one nit though: I don't think Trino is part of the ASF :)