by bafe 937 days ago
I briefly tried windmill for a project that involved creating custom workflows on demand from a configuration file. I can't recommend windmill for data pipelines; it is meant to be more of a low-code internal app platform like retool or budibase. It is meant for a relatively static workflow that requires human intervention, like a simple business process involving some API calls and human approvals in the loop. For complex (and potentially reconfigurable) data pipelines, dagster is a much better choice.
3 comments

When you tried it, we didn't have s3 integrations or restartable flows. We will present all of this on day 5 of our launch week, and it might be a good time to revisit.

I agree we were not a good fit back then. I think we would now compare favorably, as we will offer excellent ergonomics for data processing, leveraging polars, duckdb, and other OLAP libraries to their full extent.

It was several months ago, so it is entirely possible. Back then I got the feeling that windmill tried to be more of a low-code business/internal tool platform than a data/ETL workflow tool. I especially missed an expressive way to define workflows programmatically (I think you had a JSON schema but nothing as powerful as dagster, where you can define a whole workflow in pure python).
We added a vscode extension to build everything from code. You can edit either the yaml or the UI, and a change in one is instantly reflected in the other: https://www.windmill.dev/blog/launch-week-1/vscode-extension

yaml is not real code, so the part that I consider to be real code is the steps: each step is its own file in python or typescript that you can edit in your code editor, with your plugins and testing frameworks working as usual. They are normal functions that you can run locally.

It would be possible for us to do what dagster/prefect/airflow do, which is to use a macro-processing step or decorators to build the graph dynamically. That graph is what our yaml is in the end: a 1:1 encoding of our dag spec, called openflow: https://docs.windmill.dev/docs/openflow/
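To make the decorator idea concrete, here is a minimal sketch of how decorators can register plain functions as nodes and then emit a serializable graph. Everything in it is an assumption for illustration: the `step` decorator, the `build_spec` helper, and the `modules`/`depends_on` field names are hypothetical and are not the openflow schema or any windmill, dagster, prefect, or airflow API.

```python
import json
from typing import Callable, Dict, Sequence

# Hypothetical registry: step name -> function and the names of its upstream steps.
_STEPS: Dict[str, dict] = {}


def step(*, after: Sequence[str] = ()):
    """Register a plain function as a graph node (illustrative only)."""
    def register(fn: Callable) -> Callable:
        _STEPS[fn.__name__] = {"fn": fn, "after": list(after)}
        return fn  # the function stays a normal callable you can run locally
    return register


def build_spec() -> dict:
    """Topologically sort the registered steps and emit a plain-dict graph."""
    ordered, done = [], set()

    def visit(name: str, stack: tuple = ()) -> None:
        if name in done:
            return
        if name in stack:
            raise ValueError(f"cycle detected at step {name!r}")
        for dep in _STEPS[name]["after"]:
            visit(dep, stack + (name,))
        done.add(name)
        ordered.append(name)

    for name in _STEPS:
        visit(name)
    # Field names below are made up for this sketch, not the openflow schema.
    return {"modules": [{"id": n, "depends_on": _STEPS[n]["after"]} for n in ordered]}


@step()
def extract():
    return [1, 2, 3]


@step(after=["extract"])
def transform(rows):
    return [r * 2 for r in rows]


@step(after=["transform"])
def load(rows):
    return sum(rows)


SPEC = build_spec()
print(json.dumps(SPEC, indent=2))
```

The decorated functions remain ordinary functions, which matches the point above about steps being runnable and testable locally; the decorator only records where each one sits in the graph.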

We haven't done it yet because in most cases the decorators are a lot like yaml: they are a very rigid way of declaring that some functions are nodes that you can chain in limited ways. On the other hand, not providing that mode allows us to put more effort into the low-code graph builder for now.

But, as someone who loves compilers and has built a few, I'm very eager for us to provide such a mode, so it's probably a few months away :)

We'd love to have your input on the DX for data/ETL once we present it on Friday, so feel free to join our discord or shoot me an email at ruben@windmill.dev

Thank you for the extensive reply! The first part you mentioned is exactly what we were missing back then: we wanted to dynamically generate workflows starting from a configuration selected by the user. This wasn't really possible unless we generated the YAML openflow specification ourselves. In the end we gave up and rolled our own simple tool that just does what little we need. That said, it is cool that you are considering offering a more code-friendly way to define workflows. I still think this doesn't offer the same level of dynamism as dagster, where you can easily design branching/conditional workflows. I suppose your considerations regarding the decorators/compilers go exactly in that direction.
Yes, it goes in that direction. However, note that you can already do this without too much effort.
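For readers wondering what "generating the specification ourselves" from a user-selected configuration might look like, here is a rough sketch. The config keys (`source`, `transforms`, `notify`, `target`) and the output fields are invented for illustration; they are not the openflow schema, and a real generator would have to follow the published spec.

```python
import json


def flow_from_config(config: dict) -> dict:
    """Build a flow description from a user configuration (hypothetical schema)."""
    modules = [{"id": "extract", "source": config["source"]}]

    # One module per requested transform, in the order the user listed them.
    for i, op in enumerate(config.get("transforms", [])):
        modules.append({"id": f"transform_{i}", "op": op})

    # Conditional step: only present when the config asks for a notification.
    if config.get("notify"):
        modules.append({"id": "notify", "channel": config["notify"]})

    modules.append({"id": "load", "target": config["target"]})
    return {"summary": config.get("name", "generated flow"), "modules": modules}


# Example configuration, as a user might select it (all values are made up).
CONFIG = {
    "name": "daily sales",
    "source": "s3://example-bucket/sales.csv",
    "transforms": ["dedupe", "aggregate"],
    "target": "warehouse.sales_daily",
}

FLOW = flow_from_config(CONFIG)
print(json.dumps(FLOW, indent=2))
```

The shape of the graph here depends entirely on the configuration, which is the kind of dynamism being asked for; the cost is that the generator has to track the target schema by hand.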

Our openflow spec is both open-source and has a full openapi definition: https://github.com/windmill-labs/windmill/blob/main/openflow...

You can use that to generate client sdks in any language and build your own dag with it. That's what one of our customers did when building a reactflow to openflow library: https://github.com/Devessier/reactflow-to-windmill

It's not as good as the decorator way, but we move fast, and if you are still interested in it we could prioritize it (and ask for feedback :))

Thanks Ruben - looking forward to the Dagster comparison post in the coming days.
I find dagster to be quite poor at dynamism. The software-defined asset is fundamentally right, but dagster just doesn't nail it IMO.
Thanks - great to hear your perspective.