Hacker News
by calebtv 1193 days ago
Thanks! The process function runs as a Ray Actor (https://docs.ray.io/en/latest/ray-core/actors.html). So we have the same serialization requirements as Ray (https://docs.ray.io/en/latest/ray-core/objects/serialization...)

I think the most common limitation will be ensuring that your output is serializable. Typically, returning Python dictionaries or dataclasses is fine.
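For anyone who wants to sanity-check an output type locally, here is a rough sketch. It uses the stdlib `pickle` module as a stand-in, which is only an approximation of Ray's own cloudpickle-based serializer, and `Reading`, `process`, and `is_serializable` are hypothetical names made up for illustration, not part of BuildFlow or Ray.

```python
import pickle
import threading
from dataclasses import dataclass


@dataclass
class Reading:
    # Hypothetical output type: plain fields only.
    sensor_id: str
    value: float


def process(payload: dict) -> Reading:
    # Hypothetical process function that returns a plain dataclass.
    return Reading(sensor_id=payload["sensor_id"], value=float(payload["value"]))


def is_serializable(obj) -> bool:
    # Stand-in check with stdlib pickle. Ray's serializer accepts more
    # than this (for example lambdas), so treat False as a hint to look
    # closer rather than a definitive answer.
    try:
        pickle.dumps(obj, protocol=5)
        return True
    except Exception:
        return False


result = process({"sensor_id": "a1", "value": "3.5"})
ok_dataclass = is_serializable(result)
ok_dict = is_serializable({"sensor_id": "a1", "value": 3.5})
# Objects that wrap OS-level resources, such as locks, cannot be pickled.
ok_lock = is_serializable({"lock": threading.Lock()})
```

The usual culprits are outputs that hold onto locks, open sockets, or client handles; converting to plain fields before returning tends to avoid the problem.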

But if you had a specific limitation in mind, let me know. Happy to dive into it!


One other thing I should mention that's relevant: we also have a class abstraction as an alternative to the decorator: https://github.com/launchflow/buildflow/blob/main/buildflow/...

This can help with things like setting up RPC clients. But it all boils down to the same runner whether you're using the class or decorator.
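To illustrate why a class helps here, a minimal plain-Python sketch of the general pattern: state such as a client is built once and reused for every element. The names `FakeRpcClient`, `EnrichProcessor`, `setup`, and `process` are assumptions for illustration only and are not BuildFlow's actual class interface.

```python
class FakeRpcClient:
    # Hypothetical stand-in for a real RPC client; counts constructions.
    instances = 0

    def __init__(self):
        FakeRpcClient.instances += 1

    def lookup(self, key: str) -> str:
        # Pretend remote call: just upper-cases the key.
        return key.upper()


class EnrichProcessor:
    # Hypothetical processor class, not a real BuildFlow base class.
    def setup(self):
        # Called once before processing, so the client is shared
        # across all elements instead of being rebuilt per call.
        self.client = FakeRpcClient()

    def process(self, element: dict) -> dict:
        # Returns a plain dict, which keeps the output easy to serialize.
        return {**element, "label": self.client.lookup(element["name"])}


processor = EnrichProcessor()
processor.setup()
outputs = [processor.process({"name": n}) for n in ["a", "b", "c"]]
```

With a bare function, the client would either be rebuilt on every call or live in module-level state; the class gives it an obvious home.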

Do you see this as a direct competitor to Ray's built-in workflow abstraction? https://docs.ray.io/en/latest/workflows/management.html

Exciting to see more libraries built on Ray in any case!

Great question! We actually looked at using the workflow abstraction for batch processing in our runner, but ultimately didn't because it was still in alpha (we use the dataset API for batch flows).

I think one area where we differ is our focus on stream processing, which I don't think is well supported by the workflow abstraction. We also put more emphasis on resource management and use-case-driven IO.

Makes a ton of sense! I was present at the demo for this at last year's Ray conference and I definitely got the sense that a lot of the orchestration details were still being thought through, and that it was not yet a first-class streaming product.

Definitely like seeing more streaming-focused orchestration tools out there - it's a growing niche without enough alternatives to Beam.

We're thinking about attending this year's conference, so maybe we'll see you there :)