| > Ray backend runner Yes, give it a whirl and let us know what you think! Ray is amazing and has actually gotten a lot better post their 2.0 release :) > Is this based on Apache Arrow? Indeed it is, and thanks for the feedback. We'll make this a little more visible. We use the arrow2 Rust crate (same one that Polars uses) for our in-memory data representation. Our data representation makes it such that converting Daft into a Ray dataset (`df.to_ray_dataset()`) is actually zero-copy. So you can go from data transformations into downstream ML stuff in Ray really easily. > It would be AMAZING to be able to take Polars code and run it in a distributed cluster with minimal changes. Unfortunately we don't have Polars API compatibility. This seems to be a recurring theme in this thread though. The problem is that certain Polars expressions are non-trivial to do in a distributed setting, and Polars itself as a project is so young and moves so quickly it's hard for us to maintain 100% API-compatibility. That being said, you are correct that a lot of the API is very much inspired by Polars, which should hopefully make it easy to move between the two. |