by bothra90 1709 days ago
Is this solving similar problems as Ray [1]?

[1] https://www.ray.io/


Hey, I am the author of Fugue.

Fugue is a higher-level abstraction than Ray. It provides unified, non-invasive interfaces for using Spark, Dask and Pandas. Ray/Modin support is also on our roadmap.

It provides both a Python interface (not pandas-like) and Fugue SQL (standard SQL plus extra features). Users can choose whichever they are most comfortable with as the semantic layer for distributed computing; the two are equivalent.

With Fugue, most of your logic will be in simple Python/SQL that is framework and scale agnostic. From the mindset to the code, Fugue minimizes your dependency on any specific computing framework, including Fugue itself.
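To make "framework and scale agnostic" concrete, here is a toy sketch of the idea in plain standard-library Python. It is not Fugue's API: `add_total`, `partition`, `run_local` and `run_threaded` are hypothetical names invented for illustration, and the two runners only stand in for real engines such as Pandas or Spark. The point is that the business logic is written once and never mentions the engine that runs it.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby
from operator import itemgetter
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Logic = Callable[[List[Row]], List[Row]]


def add_total(rows: List[Row]) -> List[Row]:
    """Business logic: plain Python with no knowledge of any engine."""
    return [{**r, "total": r["price"] * r["qty"]} for r in rows]


def partition(rows: List[Row], key: str) -> List[List[Row]]:
    """Split rows into groups that share the same value for `key`."""
    ordered = sorted(rows, key=itemgetter(key))  # stable sort keeps input order within a group
    return [list(group) for _, group in groupby(ordered, key=itemgetter(key))]


def run_local(fn: Logic, rows: List[Row], key: str) -> List[Row]:
    """Hypothetical 'local engine': process partitions one at a time."""
    return [row for part in partition(rows, key) for row in fn(part)]


def run_threaded(fn: Logic, rows: List[Row], key: str) -> List[Row]:
    """Hypothetical 'parallel engine': process partitions in a thread pool."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map yields results in the same order as the input partitions
        results = pool.map(fn, partition(rows, key))
        return [row for part in results for row in part]


orders = [
    {"store": "a", "price": 2.0, "qty": 3},
    {"store": "b", "price": 1.5, "qty": 4},
    {"store": "a", "price": 5.0, "qty": 1},
]

# The same function runs unchanged on either runner.
local_result = run_local(add_total, orders, "store")
threaded_result = run_threaded(add_total, orders, "store")
```

Switching runners changes only the call site, not `add_total`, which is the kind of separation described above.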

Please let me know if you want to learn more. Our Slack link is in the README of the Fugue repo.

Fugue repo: https://github.com/fugue-project/fugue
Tutorials: https://fugue-project.github.io/tutorials/

What kind of parser does FugueSQL use? Does it use Apache Calcite?
No, we use ANTLR; we have no dependency on Java.
no
Well, sort of. Fugue overall is a scaling engine like Ray. The specific link, to yet another SQL access layer over a dataset, doesn't really have an analog in Ray, but it has some nice features.

I love these SQL layers, but they can obscure how they implement their transforms. They speed up writing filters and joins... until something breaks, and then you have to go down to the low level anyway.

Fugue is a translation layer from SQL to an underlying runtime: Pandas, Dask or Spark.

Each of the runtimes supported by Fugue can be compared to Ray, but Fugue itself is a different kind of tool.

That is very true. Thank you.

Fugue SQL is one way, and Fugue also has a functional API. Both are translated to the underlying runtime. You can choose based on your preference and actual needs.

Much better worded than my post above. Yup to this.