Hacker News
by jaychia 1109 days ago
> Ray backend runner

Yes, give it a whirl and let us know what you think! Ray is amazing and has actually gotten a lot better post their 2.0 release :)

> Is this based on Apache Arrow?

Indeed it is, and thanks for the feedback. We'll make this a little more visible. We use the arrow2 Rust crate (same one that Polars uses) for our in-memory data representation.

Because of this representation, converting a Daft dataframe into a Ray dataset (`df.to_ray_dataset()`) is actually zero-copy. So you can go from data transformations into downstream ML work in Ray really easily.
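For anyone unsure what "zero-copy" buys you, here is a rough stdlib-only sketch of the idea. To be clear, this is not Daft, Arrow, or Ray code: `column` is just a hypothetical stand-in for an Arrow buffer, and `memoryview` stands in for a second consumer reading the same memory.

```python
from array import array

# A contiguous column of 64-bit ints (hypothetical stand-in for an Arrow buffer).
column = array("q", [1, 2, 3, 4])

# Zero-copy handoff: the memoryview wraps the same memory, nothing is duplicated.
view = memoryview(column)

# Copying handoff: a new array with its own independent buffer.
copied = array("q", column)

# Mutate the original to show which consumer shares memory with it.
column[0] = 100

print(view[0])    # the view sees the change because it shares the buffer
print(copied[0])  # the copy still holds the old value
```

The zero-copy handoff costs the same regardless of how large the column is, which is why it matters when passing big datasets between systems.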

> It would be AMAZING to be able to take Polars code and run it in a distributed cluster with minimal changes.

Unfortunately we don't have Polars API compatibility. This seems to be a recurring theme in this thread though. The problem is that certain Polars expressions are non-trivial to implement in a distributed setting, and Polars itself is so young and moves so quickly that it's hard for us to maintain 100% API compatibility.

That being said, you are correct that a lot of the API is very much inspired by Polars, which should hopefully make it easy to move between the two.

1 comment

Please at least reach out to the Polars folks and see what's possible.