by tqh 1911 days ago
This might be outdated info:

  * Two different Prestos, prestodb and prestosql, for maximum confusion. (I think one has since been renamed.)
  * Making the coordinator highly available by default is hard
  * Autoscaling workers is not simple
  * Code is very dependent on its own web framework that tries to do everything and lacks docs.
  * Resource planner for multiple queries is lacking
  * Worker configuration takes a lot of skill
All of these could be solved, but in most cases you can find other solutions where you get a simpler set of problems.
2 comments

tqh, sounds like you hit the nail on the head with your answer. Thank you very much for your insights.
Hey, I'm a contributor to prestosql (the one that was renamed to Trino). I'll offer a few opinions on some of these from the vantage point of our project.

* It's definitely confusing, but it's pretty common in open source projects to see the original creators split off when corporate oversight interferes with the open source governance model (https://www.computerworld.com/article/2746627/hudson-devs-vo...). This is especially true when, as the OP mentioned, it's pretty cool tech and there's a lot of interest in it. Now that the names are different, it is clearing up a bit. We're hoping that in a few years there will be one project standing so that you won't have to choose. I don't have to tell you which one I think it will be.

* Active-active HA is not really necessary IMO, since Trino is designed for low-latency interactive queries in general. It can handle longer-running batch queries, but it gives up fault tolerance in order to fail fast, and you just resubmit the query. Predecessors like Hive and Spark handle ETL and long-running batch processes efficiently, but they add complexity to the query in order to checkpoint the work. I could see the need for active-passive HA, with a standby on deck during a failure. Setting up your own active-passive HA is as simple as putting two coordinators behind a proxy and pointing your workers to the proxy address. Then you basically have the proxy run health checks and flip over in the event of an outage. Here's the issue to track native HA, though: https://github.com/trinodb/trino/issues/391

* I'm not sure why autoscaling is said to be difficult. I think this is why you have Kubernetes and Docker to manage this type of workload.

* The only reason this is a pain to me is that engineers wanting to join our community and commit have a bit of a learning curve and depend heavily on us mentoring and guiding them on how the REST API works, which we don't mind. However, I agree with this choice from a design perspective for the user. If you want to use Trino, it's better not to be exposed to this implementation detail or to mess with how it works. It will likely cause you more pain.

* This has improved in the two years since we branched from PrestoDB. See our 2019 (https://trino.io/blog/2020/01/01/2019-summary.html) and 2020 (https://trino.io/blog/2021/01/08/2020-review.html) summaries.

* Agreed, we are working out what a better model looks like here: https://github.com/trinodb/trino/discussions/6573
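
To make the active-passive idea above a bit more concrete, here is a minimal Python sketch of the "health check and flip over" logic a proxy would apply. It is an illustration, not how any particular proxy or Trino itself does it. The function names `probe` and `pick_active` and the coordinator hostnames are hypothetical, and the status endpoint (`/v1/info` with a boolean `starting` field) is an assumption you should verify against your own deployment.

```python
import json
import urllib.request
from typing import Callable, Optional, Sequence


def probe(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the coordinator answers and reports it is not starting.

    Assumption: a JSON status endpoint at /v1/info with a boolean
    "starting" field. Adjust the path and check for your deployment.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/v1/info", timeout=timeout) as resp:
            info = json.load(resp)
            return resp.status == 200 and not info.get("starting", False)
    except (OSError, ValueError):
        # Connection errors, timeouts, HTTP errors, and bad JSON all count as unhealthy.
        return False


def pick_active(
    coordinators: Sequence[str],
    current: Optional[str],
    is_healthy: Callable[[str], bool] = probe,
) -> Optional[str]:
    """Choose which coordinator the proxy should route to.

    Stay on the current coordinator while it is healthy (no flapping);
    otherwise fail over to the first other healthy one. Returns None
    when nothing is healthy.
    """
    if current is not None and is_healthy(current):
        return current
    for url in coordinators:
        if url != current and is_healthy(url):
            return url
    return None


if __name__ == "__main__":
    # Hypothetical hostnames; workers would point at the proxy, not at these.
    pool = ["http://coordinator-a:8080", "http://coordinator-b:8080"]
    print("route to:", pick_active(pool, current=pool[0]))
```

Since queries fail fast anyway, anything in flight on the failed coordinator is simply resubmitted once the proxy has flipped over. In practice you would let the proxy's own health-check feature do this rather than run a script.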