
by lazaroclapp 2198 days ago

> When you set a feature flag, normally you should put a deadline on it, after this deadline, you should either choose to keep it, or remove it or you can extend the deadline.

This is actually a feature we identified as important for increasing Piranha's effectiveness, and it is being added to our internal flag tracking, but it doesn't negate the need for the tool. An expiration date makes it easier for the tool to know when to run on a given feature (rather than relying on heuristics based on % rollout and days without changes to the experiment). Piranha still: a) reduces manual effort by auto-generating the (candidate) removal patch, and b) acts as the reminder for the expiration, which makes it more actionable than just adding a task.
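To make the difference between the two triggers concrete, here is a minimal sketch of how a cleanup job might decide when to run on a flag. It is not Piranha's actual logic: the `Flag` record, the `is_cleanup_candidate` function, and the 30-day quiet period are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Flag:
    """Hypothetical flag record; field names are illustrative only."""
    name: str
    rollout_percent: int               # current rollout, 0-100
    last_changed: date                 # last time the experiment config changed
    expires_on: Optional[date] = None  # explicit deadline, if one was set


def is_cleanup_candidate(flag: Flag, today: date, quiet_days: int = 30) -> bool:
    """Return True if a removal patch should be generated for this flag."""
    if flag.expires_on is not None:
        # An explicit deadline removes the guesswork.
        return today >= flag.expires_on
    # Fallback heuristic (assumed thresholds): the rollout has settled at
    # 0% or 100% and nobody has touched the experiment for a while.
    settled = flag.rollout_percent in (0, 100)
    quiet = today - flag.last_changed >= timedelta(days=quiet_days)
    return settled and quiet
```

With a deadline the decision is a single date comparison; without one, the job has to guess from rollout and inactivity, which is where false positives and missed flags would come from.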

The thing to note is that, even in a steady state where flags are removed as soon as they go stale, enough new experiments are created every day that reducing the time spent cleaning them up is valuable.

Also, you definitely don't want to block someone from fixing a crash because they have pending expired flags, so all you can really do with any expiration policy is to remind them. With Piranha, you are reminding them and reducing the friction to solve the issue. After all, the diff is right there for them to review and click 'land' on.

As for the hardcoded flag value in your example above? What does that accomplish? It looks to me like you'd only be shipping dead code, since there is no runtime way of re-enabling the `RIDES_NEW_FEATURE` behavior (which is the main difference between "rolled 100%" vs "Piranha-removed"). It also makes it harder to remove the related code later, since the semantic information about it being part of the feature is lost. If it's just about having the old code available, then version control does that already, no? What am I missing?


bluesign 2198 days ago

I am in favor of all automation and I totally agree that the tool is valuable. I am sorry if my comments seemed to point in another direction.

What I was trying to say is basically that you can also manage the debt from feature flags without tooling.

The answer to the RIDES_NEW_FEATURE question is mainly about "you definitely don't want to block someone from fixing a crash because they have pending expired flags".

When I am disabling a flag, if I set isNewFeatureEnabled to False, I am basically removing the bloat instantly. Then, when I have time to review the code, I can also remove the dead code. This actually addresses the concerns in your blog post about "accidental activation" and "bloat" without waiting for developers to fix the code. Piranha could set the flag to its stale value, then later send the developer a task to remove the dead code.

My flow is actually a little more complicated than what I described: I also have assets and other resources tied to feature flags, and the CI pipeline removes the unused assets for a flag once it is stale. So what I have is more like: `function isNewFeatureEnabled { return isFlagEnabled(flag) && getFlagValue("RIDES_NEW_FEATURE") }`
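A runnable version of that wrapper might look like the sketch below. The in-memory `FLAG_VALUES` and `STALE_FLAGS` stores are hypothetical stand-ins for a flag service and for whatever the CI pipeline records, and the reading of `isFlagEnabled` as "this flag has not been retired yet" is an assumption.

```python
# Hypothetical in-memory stores; a real setup would query a flag service.
FLAG_VALUES = {"RIDES_NEW_FEATURE": True}
# Flags that CI has marked stale. Lookups for these are short-circuited.
STALE_FLAGS = set()


def is_flag_enabled(flag: str) -> bool:
    """Assumed meaning: the flag is still live, i.e. not marked stale."""
    return flag not in STALE_FLAGS


def get_flag_value(flag: str) -> bool:
    """Runtime value of the flag; unknown flags default to off."""
    return FLAG_VALUES.get(flag, False)


def is_new_feature_enabled() -> bool:
    # Once the flag is marked stale this returns False no matter what the
    # runtime value says, so the guarded code path cannot be re-activated
    # by accident while the dead code waits to be removed.
    flag = "RIDES_NEW_FEATURE"
    return is_flag_enabled(flag) and get_flag_value(flag)
```

Adding `"RIDES_NEW_FEATURE"` to `STALE_FLAGS` turns the feature off everywhere at once; deleting the guarded code can then happen later as a separate task.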

What I am curious about on this topic: do you have any kind of conflict detection for your feature flags?


mkr-plse 2197 days ago

BTW, we also have a tech publication at https://github.com/uber/piranha/blob/master/report.pdf where we discuss some of the design tradeoffs pertaining to the stale flag cleanup problem.

Can you elaborate on what you mean by conflict detection? Are you trying to understand how flags are dependent on each other?


bluesign 2197 days ago

Thanks a lot, I will read it asap.

Basically, I sometimes have conflicting flags that can introduce bugs, especially a rarely used flag combined with a new feature. A basic example: two different flags setting the same property to different values on a UI object.

More test coverage would probably help, but I am curious whether you have some automation to detect those cases.
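As a rough illustration of what such automation could check, the sketch below takes a list of property writes per flag and reports pairs of flags that set the same property on the same object to different values. It is not an existing tool or anything Piranha does; the `find_conflicts` function, the tuple layout, and the idea that the writes were already extracted (by hand or by some static analysis) are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def find_conflicts(writes):
    """Find flag pairs that write different values to the same property.

    `writes` is an iterable of (flag, target, prop, value) tuples, assumed
    to have been collected beforehand. Returns a sorted list of
    (target, prop, flag_a, flag_b) tuples.
    """
    # (target, prop) -> {flag: set of values that flag writes}
    by_property = defaultdict(dict)
    for flag, target, prop, value in writes:
        by_property[(target, prop)].setdefault(flag, set()).add(value)

    conflicts = []
    for (target, prop), per_flag in by_property.items():
        for flag_a, flag_b in combinations(sorted(per_flag), 2):
            if per_flag[flag_a] != per_flag[flag_b]:
                conflicts.append((target, prop, flag_a, flag_b))
    return sorted(conflicts)


# Hypothetical example data: two flags fight over the button color.
writes = [
    ("NEW_CHECKOUT", "pay_button", "color", "green"),
    ("HOLIDAY_THEME", "pay_button", "color", "red"),
    ("HOLIDAY_THEME", "banner", "visible", True),
    ("NEW_CHECKOUT", "pay_button", "label", "Pay"),
]
print(find_conflicts(writes))
```

This only flags potential conflicts: it ignores whether the two flags can ever be enabled together, which a real analysis would also need to consider.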


mkr-plse 2197 days ago

Interesting. We don't have tooling for this yet but extending the Piranha static analysis may help detect the issue.
