| Hi Chris! Thanks for the link, it's an enlightening read; I learned about dynamic variable scopes today. It did partially change my mind about "environment variables are bad for the same reasons global variables are bad." I concur that environment variables are more like constants than mutable globals, even in my language of choice, Python. If you only use them at process boundaries, they are fine, and I admit to using them that way too:

  parser = argparse.ArgumentParser()
  parser.add_argument("--foo", default=os.environ.get("FOO"))

If they are used at a boundary within a process, however:

  def foo_function():
      return foo_implementation(os.environ.get("FOO"))

Then testing foo_function() becomes a problem because os.environ isn't dynamically scoped within the process. Each test case can set os.environ["FOO"], but then the tests have mutable globals even if the app doesn't. I know three ways to solve this, each with its pros and cons:

1. Treat the script as a black box and only test it as a whole, or not at all. How env vars are used internally doesn't matter. Works well for smaller scripts.

2. Keep the code as is and test functions individually by setting and resetting the environment variables in each test's setup and teardown. Don't run tests in parallel.

3. Push all environment variable usage to process boundaries and make all inner functions pure functions that are only affected by their explicit input parameters. If needed, I even make standard in/out/error, logger instances and other similar globals explicit parameters or class members. Requires more boilerplate, but works better for more complex projects. Testing any behavior becomes easier.

I prefer to go with option #1 or #3, as #2 feels dirty and makes my test cases smell of workarounds. #3 could look like this, with a few details omitted:

  import argparse
  import os

  # Process boundary: the only place the environment is read.
  parser = argparse.ArgumentParser()
  parser.add_argument("--foo", default=os.environ.get("FOO"))
  args = parser.parse_args()

  def foo_function(foo_value):
      return foo_implementation(foo_value)

  def main():
      ...
      foo_result = foo_function(foo_value=args.foo)
      ...
  ...

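To make the testing difference between options #2 and #3 concrete, here is a minimal sketch. The names are assumptions for illustration only: foo_implementation is a made-up stand-in, foo_from_env is a hypothetical name for the variant that reads the environment internally, and unittest.mock.patch.dict is used in place of hand-written setup and teardown.

```python
import os
import unittest
from unittest import mock


def foo_implementation(value):
    # Hypothetical stand-in for the real implementation.
    return f"foo={value}"


def foo_from_env():
    # Option #2 style: reads the environment inside the function.
    return foo_implementation(os.environ.get("FOO"))


def foo_function(foo_value):
    # Option #3 style: depends only on its explicit parameter.
    return foo_implementation(foo_value)


class FooTests(unittest.TestCase):
    def test_option_2_needs_env_patching(self):
        # patch.dict restores os.environ when the block exits, but the
        # test still mutates a process-wide global while it runs.
        with mock.patch.dict(os.environ, {"FOO": "bar"}):
            self.assertEqual(foo_from_env(), "foo=bar")

    def test_option_3_needs_no_patching(self):
        # No global state involved: input in, result out.
        self.assertEqual(foo_function("bar"), "foo=bar")
```

Saved to a file, this can be run with python -m unittest. The second test has no setup at all, which is the main appeal of option #3.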
To agree with you, it would be great if the ex-globals-turned-parameters I'm passing around in option #3 were dynamically scoped. Not shown in the example above, but imagine that instead of printing to sys.stderr, functions receive a stderr: io.IOBase parameter or a custom dataclass that contains such a field. The point is to get rid of mutable global state in all cases.

To disagree with you, I think the correct term for "things the caller knows better than the implementor" is parameters. I'm not sure there's a benefit to preferring dynamic scope for parameters when most languages default to lexical scope.

About your last two points, I somewhat agree and somewhat still disagree: "CLI args are usually passed around explicitly" -- I think this is a pro, not a con. Further, CLI arguments are strictly more flexible than environment variables; most argument parsing libraries support key-value parsing in addition to boolean flags and lists.

However, regarding your overall point, which I understand as "environment variables used at process boundaries behave like dynamically scoped variables, and these are fine": I agree, as long as they stay at process boundaries. |
Sure; I never said it's a con. They have different characteristics, and are both useful in certain situations :)
> I think the correct term for "things the caller knows better than the implementor" are parameters.
True; that's also the name Racket gives to dynamically-scoped variables https://docs.racket-lang.org/guide/parameterize.html
In fact, Racket uses a parameter (dynamically-scoped variable) to store the environment. This is actually slightly annoying, since the parameter is one big hashmap of all the env vars, but I usually want to override them individually. One of my Racket projects actually defines a helper function to override individual env vars; it makes a copy of all the other environment variables (since they are all contained in a single parameter): https://github.com/Warbo/theory-exploration-benchmarks/blob/...
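For readers more at home in Python, the "copy everything, override one entry" idea can be loosely approximated with contextvars.ContextVar. This is only an analogy under stated assumptions, not the Racket helper from the link: current_env, with_env_var and read_foo are hypothetical names, and the sketch never touches os.environ itself.

```python
import contextvars
import os
from contextlib import contextmanager
from types import MappingProxyType

# Hypothetical: the whole environment held in one context variable,
# loosely mirroring a single parameter that holds every env var.
current_env = contextvars.ContextVar(
    "current_env", default=MappingProxyType(dict(os.environ))
)


@contextmanager
def with_env_var(name, value):
    """Override one variable by copying the rest into a new mapping."""
    updated = dict(current_env.get())
    updated[name] = value
    token = current_env.set(MappingProxyType(updated))
    try:
        yield
    finally:
        # Restore the previous mapping when the block exits.
        current_env.reset(token)


def read_foo():
    # Reads from the context variable, not from os.environ.
    return current_env.get().get("FOO")
```

Inside "with with_env_var("FOO", "bar"):" a call to read_foo() sees "bar", and the previous value comes back once the block exits. Child processes would not see the override unless the mapping is passed along explicitly, for example as the env argument of subprocess.run.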