by shykes 1324 days ago
Hi everyone, Dagger co-founder here. Happy to answer any questions.

We also released an update to the Go SDK a few days ago: https://dagger.io/blog/go-sdk-0.4

6 comments

It looks awesome! I have a clarification question on this one:

> Using the SDK, your program prepares API requests describing pipelines to run, then sends them to the engine. The wire protocol used to communicate with the engine is private and not yet documented, but this will change in the future. For now, the SDK is the only documented API available to your program.

Does it mean the sdk is making a round trip to the dagger API remotely somewhere, or is the round trip to a locally running docker container?

> Does it mean the sdk is making a round trip to the dagger API remotely somewhere, or is the round trip to a locally running docker container?

The short answer, for now, is: "it's complicated" :) There's a detailed explanation of the Dagger Engine architecture here: https://github.com/dagger/dagger/issues/3595

To quote relevant parts:

> The engine is made of 2 parts: an API router, and a runner.
> - The router serves API queries and dispatches individual operations to the runner.
> - The runner talks to your OCI runtime to execute actual operations. This is basically a buildkit daemon + some glue.
>
> The router currently runs on the client machine, whereas the runner is on a worker machine that will run the containers. This could be the same machine but typically isn’t.

> Eventually we will move the router to a server-side component, tightly coupled and co-located with the runner. This will be shipped as an OCI image which you will be able to provision, administer and upgrade yourself to your heart’s content. This requires non-trivial engineering work, in order to make the API router accessible remotely, and multi-tenant.

Hi, on the locally running engine.
This is very cool!

Also, from the post:

> Get started with the Dagger Go SDK, Dagger Python SDK, or let us know which SDK you're looking for.

Are you guys seeing many requests for a Dagger Rust SDK yet? :)

As someone who writes mostly in Rust, I’d love to get rid of yaml definitions of CI pipelines and instead define pipelines using Rust

Yes, we're getting a bunch of requests from Rustaceans.

Hey there, this looks interesting. What's the scope of this project? Is it meant to be closer to developers or do you see this being used in production?

How is progress of builds, observability, etc being tackled?

> What's the scope of this project? Is it meant to be closer to developers or do you see this being used in production?

Dagger is meant for both development and production. Note that Dagger doesn't run your application itself: only the pipelines to build, test and deploy it. So, although the project is still young and pre-1.0, we expect it will be production-ready more quickly because of the nature of the workloads (running a pipeline is easier than running an app).

> How is progress of builds, observability, etc being tackled?

All Dagger SDKs target the same Dagger Engine. End-users and administrators can target the engine API directly, for logging, instrumentation, etc. The API is not yet publicly documented, but will be soon.

We're also building an optional cloud service, Dagger Cloud, that will provide a lot of these features as a "turnkey" software supply chain management platform.

A) Why async in the user code? Is it really necessary?

B) Can you mock pipeline events?

> A) Why async in the user code? Is it really necessary?

We support both sync and async mode. The Python ecosystem is in a state of flux at the moment between sync and async, so it seemed like the best approach to offer both and let developers choose.

> B) Can you mock pipeline events?

Could you share a bit more details on what you mean, to make sure I understand correctly?

> We support both sync and async mode. The Python ecosystem is in a state of flux at the moment between sync and async, so it seemed like the best approach to offer both and let developers choose.

Correct me if I'm wrong, but it doesn't seem like async will really offer anything a pipeline framework might need?

> Could you share a bit more details on what you mean, to make sure I understand correctly?

I often find myself debugging complex release workflows that get triggered by a git tag or something.

If I use this thing with github, how can I debug that workflow when it fails?

Related question - as pipelines are now Python programs - it would be nice to be able to write unit tests for pipeline code :) And that's where mock pipelines/tasks could be helpful.

You can absolutely test Dagger pipelines, in the same way you test the rest of your code. Just use the dagger library in your test files, the way you would use any other library. It should work the way you expect.
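
For the mocking part of the question, here is a minimal sketch of one way it could look, using only the standard library. It does not use the real dagger library: `release`, `make_client`, and the `build`/`publish` methods are hypothetical names invented for illustration. The idea is just that a pipeline written as an ordinary async function, taking its client as an argument, can be exercised with a stand-in client in a unit test.

```python
import asyncio
import unittest
from unittest.mock import AsyncMock, MagicMock


async def release(client, tag: str) -> str:
    """Hypothetical pipeline: always build, publish only for version tags."""
    image = await client.build(tag)
    if tag.startswith("v"):
        await client.publish(image)
        return "published"
    return "built"


def make_client() -> MagicMock:
    """Stand-in client (an assumption, not a Dagger API) with awaitable methods."""
    client = MagicMock()
    client.build = AsyncMock(return_value="image:test")
    client.publish = AsyncMock()
    return client


class ReleaseTest(unittest.TestCase):
    def test_version_tag_publishes(self):
        client = make_client()
        self.assertEqual(asyncio.run(release(client, "v1.2.0")), "published")
        client.publish.assert_awaited_once_with("image:test")

    def test_other_tag_only_builds(self):
        client = make_client()
        self.assertEqual(asyncio.run(release(client, "nightly")), "built")
        client.publish.assert_not_awaited()


if __name__ == "__main__":
    unittest.main()
```

This also covers the "triggered by a git tag" case locally: the tag is just an argument, so the branch that only fires on a release can be tested without pushing a tag.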
> A) Why async in the user code? Is it really necessary?

It's not a requirement, but it's simpler to default to one and mention the other. You can see an example of sync code in https://github.com/helderco/dagger-examples/blob/main/say_sy... and we'll add a guide in the docs website to explain the difference.

Why async?

It's more inclusive. If you want to run dagger from an async environment (say FastAPI), you don't want to run blocking code. You can run the whole pipeline in a thread, but then you're not really taking advantage of the event loop. It's simpler to do the opposite: if you run in a sync environment (like all our examples, running from the CLI), it's much easier to just spin up an event loop with `anyio.run`.
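
A minimal sketch of that last point, assuming nothing about the Dagger API. It uses the standard library's `asyncio.run` as a stand-in for `anyio.run` so it runs without extra dependencies, and `build` is a hypothetical placeholder for an async pipeline.

```python
import asyncio


async def build(target: str) -> str:
    """Hypothetical async pipeline step; a real one would await engine calls."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    return f"built {target}"


def build_sync(target: str) -> str:
    """Blocking wrapper for sync callers such as a CLI entry point."""
    # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
    return asyncio.run(build(target))


if __name__ == "__main__":
    print(build_sync("app"))
```

Going from async to sync is a one-line wrapper like `build_sync`; going the other way (calling blocking code from an async app) needs a worker thread.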

It's more powerful. For most examples the difference is probably small unless you're using a lot of async features: just remove the async/await keywords and the event loop. But you can easily reach for concurrency if there's a benefit. While the dagger engine ensures most of the parallelism and efficiency, some pipelines can benefit from doing this at the language level. See this example where I'm testing a library (FastAPI) with multiple Python versions: https://github.com/helderco/dagger-examples/blob/main/test_c.... It has an obvious performance benefit compared to running "synchronously": https://github.com/helderco/dagger-examples/blob/main/test_m...
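
To illustrate the language-level concurrency point without depending on the linked files, here is a rough sketch that simulates the multi-version test run. It is not the code from those examples: `run_tests` is a hypothetical stand-in that just sleeps, the version list is made up, and `asyncio.gather` from the standard library is swapped in for whatever the real examples use.

```python
import asyncio
import time
from typing import List

VERSIONS = ["3.7", "3.8", "3.9", "3.10"]  # illustrative only


async def run_tests(version: str, delay: float = 0.1) -> str:
    """Hypothetical stand-in for testing against one Python version."""
    await asyncio.sleep(delay)  # simulates waiting on the engine
    return f"python {version}: ok"


async def sequential() -> List[str]:
    """Await each version in turn: total time is roughly the sum of delays."""
    return [await run_tests(v) for v in VERSIONS]


async def concurrent() -> List[str]:
    """Start all versions at once: total time is roughly the longest delay."""
    return list(await asyncio.gather(*(run_tests(v) for v in VERSIONS)))


def timed(coro_fn):
    """Run an async function to completion and return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (sequential, concurrent):
        results, elapsed = timed(fn)
        print(f"{fn.__name__}: {len(results)} runs in {elapsed:.2f}s")
```

Both functions return the same results in the same order; only the waiting overlaps in the concurrent version.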

Dagger has a client and a server architecture, so you're sending requests through the network. This is an especially common use case for using async.

Async Python is on the rise. More and more libraries are supporting it, more users are getting to know it, and sometimes it feels very transitional. It's very hard to maintain both async and sync code. There's a lot of duplication because you need blocking and non-blocking versions for a lot of things like network requests, file operations and running subprocesses. But I've made quite an effort to support both and meet you where you're at. I especially took great care to hide the sync/async classes and methods behind common names so it's easy to change from one to another.

I'm very interested to know the community's adoption or preference of one vs the other. :)

Don't forget about CUE, please! Dagger is a good reason to learn CUE; my gut feeling so far is that my life is better with CUE experience.

Any recs for moving into the IaC space?