


	by pydry 1324 days ago
	A) Why async in the user code? Is it really necessary? B) Can you mock pipeline events?

2 comments

shykes 1324 days ago

> A) Why async in the user code? Is it really necessary?

We support both sync and async mode. The Python ecosystem is in a state of flux at the moment between sync and async, so it seemed like the best approach to offer both and let developers choose.

> B) Can you mock pipeline events?

Could you share a bit more detail on what you mean, to make sure I understand correctly?


pydry 1323 days ago

> We support both sync and async mode. The Python ecosystem is in a state of flux at the moment between sync and async, so it seemed like the best approach to offer both and let developers choose.

Correct me if I'm wrong but it doesn't seem like async will really offer anything a pipeline framework might need?

> Could you share a bit more detail on what you mean, to make sure I understand correctly?

I often find myself debugging complex release workflows that get triggered by a git tag or something.

If I use this thing with GitHub, how can I debug that workflow when it fails?


vvladymyrov 1323 days ago

Related question - as pipelines are now Python programs, it would be nice to be able to write unit tests for pipeline code :) And that's where mock pipelines/tasks could be helpful.


shykes 1323 days ago

You can absolutely test Dagger pipelines, in the same way you test the rest of your code. Just use the dagger library in your test files, the way you would use any other library. It should work the way you expect.
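
As a rough illustration of the mocking side of the question, here is a minimal sketch that uses only the standard library. It assumes a pipeline step written as an ordinary async function that receives its client as an argument. `run_tests` and `client.run` are hypothetical names invented for this example and are not part of the Dagger API.

```python
import asyncio
from unittest.mock import AsyncMock


async def run_tests(client, version: str) -> str:
    """Hypothetical pipeline step: ask the client to run the test suite."""
    # `client.run` is an assumed method, not a real Dagger call.
    output = await client.run(f"python{version}", ["pytest"])
    return output.strip()


def check_run_tests_uses_requested_version() -> None:
    # AsyncMock makes `client.run` awaitable and records how it was awaited.
    client = AsyncMock()
    client.run.return_value = "3 passed\n"

    result = asyncio.run(run_tests(client, "3.11"))

    assert result == "3 passed"
    client.run.assert_awaited_once_with("python3.11", ["pytest"])


check_run_tests_uses_requested_version()
```

Because the step takes the client as a parameter, the test never needs a running engine; whether that trade-off suits a given pipeline is a judgment call.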


helderco 1323 days ago

> A) Why async in the user code? Is it really necessary?

It's not a requirement, but it's simpler to default to one and mention the other. You can see an example of sync code in https://github.com/helderco/dagger-examples/blob/main/say_sy... and we'll add a guide in the docs website to explain the difference.

Why async?

It's more inclusive. If you want to run dagger from an async environment (say FastAPI), you don't want to run blocking code. You can run the whole pipeline in a thread, but then you're not really taking advantage of the event loop. It's simpler to do the opposite: if you run in a sync environment (like all our examples, which run from the CLI), it's much easier to just spin up an event loop with `anyio.run`.
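
A minimal sketch of "spinning up an event loop" from sync code. To stay stdlib-only it uses `asyncio.run` in place of `anyio.run`, which plays the same role here. The `pipeline` coroutine is a hypothetical placeholder, not real Dagger code.

```python
import asyncio


async def pipeline() -> str:
    """Hypothetical async pipeline; a real one would await client calls."""
    await asyncio.sleep(0)  # stand-in for awaiting network requests
    return "pipeline finished"


def main() -> str:
    # Sync entry point (e.g. a CLI): start an event loop, run the
    # coroutine to completion, and return its result.
    return asyncio.run(pipeline())


if __name__ == "__main__":
    print(main())
```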

It's more powerful. For most examples the difference is probably small unless you're using a lot of async features: to go sync, just remove the async/await keywords and the event loop. But you can easily reach for concurrency if there's a benefit. While the dagger engine takes care of most of the parallelism and efficiency, some pipelines can benefit from doing this at the language level. See this example where I'm testing a library (FastAPI) with multiple Python versions: https://github.com/helderco/dagger-examples/blob/main/test_c.... It has an obvious performance benefit compared to running "synchronously": https://github.com/helderco/dagger-examples/blob/main/test_m...
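
To make the concurrency point concrete, here is a small sketch that does not use the Dagger SDK: each "test run" is simulated with `asyncio.sleep`, and `asyncio.gather` is used rather than anyio. `check_version` and the version list are assumptions made up for the illustration.

```python
import asyncio
import time

VERSIONS = ["3.9", "3.10", "3.11"]  # hypothetical Python versions to test


async def check_version(version: str) -> str:
    """Simulated test run; the sleep stands in for waiting on the engine."""
    await asyncio.sleep(0.05)
    return f"python {version}: ok"


async def run_sequentially() -> list:
    # Each run waits for the previous one to finish.
    return [await check_version(v) for v in VERSIONS]


async def run_concurrently() -> list:
    # All runs wait at the same time; gather keeps the input order.
    return list(await asyncio.gather(*(check_version(v) for v in VERSIONS)))


def timed(coro_fn):
    """Run an async function in a fresh event loop and time it."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn())
    return results, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (run_sequentially, run_concurrently):
        results, elapsed = timed(fn)
        print(f"{fn.__name__}: {len(results)} runs in {elapsed:.2f}s")
```

Both functions return the same results in the same order; only the total waiting time differs, since the concurrent version overlaps the waits.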

Dagger has a client and a server architecture, so you're sending requests through the network. This is an especially common use case for using async.

Async Python is on the rise. More and more libraries are supporting it, more users are getting to know it, and sometimes it feels very transitional. It's very hard to maintain both async and sync code. There's a lot of duplication because you need blocking and non-blocking versions for a lot of things like network requests, file operations and running subprocesses. But I've made quite an effort to support both and meet you where you're at. I especially took great care to hide the sync/async classes and methods behind common names so it's easy to change from one to another.

I'm very interested to know the community's adoption or preference of one vs the other. :)
