Personal experience building infrastructure and an autoscaler for these things (which performed horribly due to API lag) while having weekly meetings with our GitHub rep to get the --ephemeral flag in (took about a year longer than promised). Sometimes the exit code would be 0 when using `--once`, and sometimes it would not be. Sometimes it'd also be 0 if the token somehow didn't work and the worker couldn't even register no matter how often you restarted it (of course with a cryptic Azure IAM error code). Either way, we eventually just decided that throwing away the machine if the runner exists for any reason was safest.