by rendaw 52 days ago
How is this different from Terraform? Generally if something fails during a TF apply it saves the state of all the stuff that worked and just retries the thing that failed when you next run it. And reverting your TF stack and doing apply again should walk changes back.

There are specific cases where that's not possible, and there are bugs, but that doesn't seem to be what you described, unless you meant that you only support a limited subset of resources that are known to be robust to reverts? That's a fairly different claim, though.


The main difference is granularity. Terraform runs a plan and applies it as a batch. If something fails, you re-run apply and it retries from the last saved state... but that state is per-resource, not per-API-call.

Alien tracks state at the individual API call level. A single resource creation might involve 5-10 API calls (create IAM role -> attach policy -> create function -> configure triggers -> set up DNS...). If it fails at step 7, it resumes from step 7. Terraform would retry the entire resource.
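To make the step-level resume concrete, here is a minimal sketch of per-call checkpointing. It is not Alien's actual implementation: the step names are borrowed from the example chain above, and `run_resource`, `make_action`, and the in-memory `checkpoint` list are hypothetical stand-ins for real API calls and persisted state.

```python
# Hypothetical step names, taken from the example chain in the comment above.
STEPS = [
    "create_iam_role",
    "attach_policy",
    "create_function",
    "configure_triggers",
    "set_up_dns",
]

def run_resource(steps, actions, completed):
    """Run each step in order, skipping steps already recorded in `completed`.

    `completed` plays the role of the checkpoint. A real system would persist
    it after every step; here it is just a list kept in memory.
    """
    for name in steps:
        if name in completed:
            continue              # finished in an earlier run, do not repeat
        actions[name]()           # may raise; nothing is recorded for this step
        completed.append(name)    # record success before moving on
    return completed

# Demo: "create_function" fails once, then succeeds on the next run.
calls = []
failures = {"create_function": 1}

def make_action(name):
    def action():
        calls.append(name)
        if failures.get(name, 0) > 0:
            failures[name] -= 1
            raise RuntimeError(f"{name} failed")
    return action

actions = {name: make_action(name) for name in STEPS}
checkpoint = []

try:
    run_resource(STEPS, actions, checkpoint)   # stops at create_function
except RuntimeError:
    pass

run_resource(STEPS, actions, checkpoint)       # resumes at create_function
```

In this sketch the second run never repeats `create_iam_role` or `attach_policy`; only the failed step and the ones after it are executed.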

The other difference is that Alien runs continuously, not as a one-shot apply. It's a long-running control plane that watches the environment, detects drift, and reconciles. Terraform assumes you run it, it converges, and then nothing changes until you run it again.
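The watch, detect drift, reconcile cycle can be sketched roughly as follows. This is an illustration of the general pattern only, under the assumption that desired and actual state are flat dictionaries; `diff`, `reconcile_once`, and the config keys are hypothetical and say nothing about how Alien stores or compares state.

```python
def diff(desired, actual):
    """Return the settings in `desired` that `actual` is missing or has wrong."""
    changes = {}
    for key, want in desired.items():
        if actual.get(key) != want:
            changes[key] = want
    return changes

def reconcile_once(desired, actual):
    """One pass of the loop: detect drift and apply the corrections.

    A long-running control plane would call this repeatedly (on a timer or
    on change events) instead of once per manual apply.
    """
    changes = diff(desired, actual)
    actual.update(changes)        # stand-in for real API calls
    return changes

# Hypothetical config for a single function resource.
desired = {"memory_mb": 512, "timeout_s": 30}
actual = {"memory_mb": 512, "timeout_s": 30}

actual["timeout_s"] = 5                      # someone edits it out of band
first = reconcile_once(desired, actual)      # drift detected and corrected
second = reconcile_once(desired, actual)     # nothing left to do
```

The sketch only corrects keys listed in `desired`; it leaves extra keys in `actual` alone, which a real reconciler would have to decide how to handle.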

Speaking of granularity, I noticed that the 2 states of a resource seem to be:

> Frozen: Alien can only monitor it. Created once during setup, then Alien has no permissions to modify or delete it.

> Live: Alien can manage it from your cloud. Push code updates, roll config changes, redeploy — without the customer's involvement.

Is that really all? What about something like "Alien can run these 37 maintenance and debugging commands but cannot touch the firewall or modify routes or change any other access methods to internal resources"?

(I'm looking at https://www.alien.dev/docs/how-alien-works here.)
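To make the question concrete, the kind of in-between policy I have in mind might look like the sketch below. Everything in it is hypothetical: the command names, the `ALLOWED` and `DENIED` patterns, and `is_permitted` are made up for illustration and are not taken from the Alien docs.

```python
from fnmatch import fnmatchcase

# Hypothetical command patterns; a real policy would list the actual commands.
ALLOWED = ["logs:read", "service:restart", "debug:*"]
DENIED = ["firewall:*", "routes:*", "access:*"]

def is_permitted(command):
    """Deny rules win; anything not explicitly allowed is rejected."""
    if any(fnmatchcase(command, pattern) for pattern in DENIED):
        return False
    return any(fnmatchcase(command, pattern) for pattern in ALLOWED)
```

With this policy `is_permitted("debug:dump_heap")` is allowed, while `is_permitted("firewall:update")` and an unlisted command such as `is_permitted("db:drop")` are both rejected.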