| > I believe environment variables are bad for the same reasons global variables are bad. Global variables are bad, but environment variables are actually more like dynamic variables: http://www.chriswarbo.net/blog/2021-04-08-env_vars.html Dynamic scope is useful for things the caller knows better than the implementor, e.g. configuration, credentials, etc. > Another alternative to consider for both env vars and config files are command line arguments The two things which distinguish CLI arguments from env vars are: - Env vars are usually readable from anywhere, whilst CLI args are usually passed around explicitly (more like lexical scope) - Env vars are inherently key=value pairs, whilst CLI arguments are better suited to checking presence/absence (e.g. 'foo' versus 'foo --force'), parameters which don't need names (e.g. 'foo myFile') and variable-length lists of parameters (e.g. 'foo file1 file2 file3') |
It did make me change my mind partially about "environment variables are bad for the same reasons global variables are bad." I concur that environment variables are more like constants than mutable globals, even in my language of choice, Python. If you only use them at process boundaries, they are fine. I admit to using them that way too:
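Roughly along these lines, as a minimal sketch. The variable name FOO, the default value and the greet() helper are placeholders made up for illustration:

```python
import os


def greet(name: str) -> str:
    # Inner function: knows nothing about the environment.
    return f"Hello, {name}!"


def main() -> None:
    # Process boundary: the only place where FOO is read.
    name = os.environ.get("FOO", "world")
    print(greet(name))


if __name__ == "__main__":
    main()
```

Here greet() can be tested without touching os.environ at all, because the environment lookup happens once, in main().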
If they are used at a boundary within a process, however: Then testing foo_function() becomes a problem, because os.environ isn't dynamically scoped within the process. Each test case can set os.environ["FOO"], but then the tests have mutable globals even if the app doesn't. I know three ways to solve this, each with its pros and cons:
- 1. Treat the script as a black box, only test the script as a whole -- or not at all. How env vars are used internally doesn't matter. Works well for smaller scripts.
- 2. Keep the code as is, test functions individually by setting and resetting the environment variables in each test setup and teardown. Don't run tests in parallel.
- 3. Push all environment variable usage to process boundaries and make all inner functions pure functions that are only affected by their explicit input parameters. If needed, I even make standard in/out/error, logger instances and other similar globals explicit parameters or class members. Requires more boilerplate, works better for more complex projects. Testing any behavior becomes easier.
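To make option 2 concrete, here is a rough sketch of a function that reads os.environ internally, plus a test that sets and restores the variable around each case. The body of foo_function() and the values "default" and "bar" are assumptions for illustration only:

```python
import os
import unittest


def foo_function() -> str:
    # Reads the environment deep inside the process, not at its boundary.
    return os.environ.get("FOO", "default").upper()


class FooFunctionTest(unittest.TestCase):
    def setUp(self) -> None:
        # Remember the previous value so tearDown can restore it.
        self._saved = os.environ.get("FOO")
        os.environ["FOO"] = "bar"

    def tearDown(self) -> None:
        # Put the process-wide state back exactly as it was.
        if self._saved is None:
            os.environ.pop("FOO", None)
        else:
            os.environ["FOO"] = self._saved

    def test_reads_foo(self) -> None:
        self.assertEqual(foo_function(), "BAR")
```

This can be run with `python -m unittest`. The standard library's unittest.mock.patch.dict(os.environ, {...}) does the same save-and-restore with less code, but the environment is still shared process-wide state while the test runs, which is why running such tests in parallel threads is risky.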
I prefer to go with option #1 or #3, as #2 feels dirty and makes my test cases smell of workarounds. #3 could look like this, with a few details omitted:
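A minimal sketch of that shape, assuming a single variable FOO and a made-up body for foo_function():

```python
import os
import sys


def foo_function(foo: str) -> str:
    # The result depends only on the explicit argument, never on os.environ.
    if not foo:
        # Diagnostics still go to the global sys.stderr in this sketch.
        print("warning: FOO is empty", file=sys.stderr)
    return foo.upper()


def main() -> None:
    # Process boundary: read the environment once, then pass values down.
    foo = os.environ.get("FOO", "default")
    print(foo_function(foo))


if __name__ == "__main__":
    main()
```

A test can now call foo_function("bar") directly, with no setup or teardown of environment variables.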
To agree with you, it would be great if the ex-globals-turned-parameters I'm passing around in option #3 were dynamically scoped. Not shown in the example above, but imagine that instead of printing to sys.stderr, functions receive an stderr: io.IOBase parameter or a custom dataclass that contains such a field. The point is to get rid of mutable global state in all cases.

To disagree with you, I think the correct term for "things the caller knows better than the implementor" is parameters. I'm not sure there's a benefit to preferring dynamic scope for parameters when most languages default to lexical scope.
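The stderr-as-parameter idea mentioned above could be sketched like this. The parameter is annotated as typing.TextIO rather than io.IOBase, which is an assumption on my side so that type checkers accept both sys.stderr and io.StringIO; the function body is again made up:

```python
import io
import sys
from typing import TextIO


def foo_function(foo: str, stderr: TextIO) -> str:
    # The diagnostic stream is an explicit input, not a global.
    if not foo:
        print("warning: FOO is empty", file=stderr)
    return foo.upper()


# In a test, an in-memory buffer stands in for the real stream.
fake_stderr = io.StringIO()
result = foo_function("", fake_stderr)
captured = fake_stderr.getvalue()
```

At the process boundary the real stream is passed in, e.g. foo_function(value, sys.stderr), so the inner function never reaches for the global itself.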
About your last two points, I partly agree and partly still disagree. "CLI args are usually passed around explicitly" -- I think this is a pro, not a con. Further, CLI arguments are strictly more flexible than environment variables: most argument parsing libraries support key-value parsing in addition to boolean flags and lists.
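As an illustration with Python's argparse, here is a small sketch covering all three shapes. The option names --force and --name and the program name foo are placeholders:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="foo")
    # Presence/absence flag, as in 'foo --force'.
    parser.add_argument("--force", action="store_true")
    # Key-value option, as in 'foo --name=value' or 'foo --name value'.
    parser.add_argument("--name", default="world")
    # Variable-length list of unnamed parameters, as in 'foo file1 file2'.
    parser.add_argument("files", nargs="*")
    return parser


# Parse an explicit argument list instead of sys.argv, which keeps this testable.
args = build_parser().parse_args(["--force", "--name=value", "file1", "file2"])
```

After parsing, args.force, args.name and args.files are ordinary values that can be handed to inner functions as explicit parameters.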
However, regarding your overall point, which I understand as "environment variables used at process boundaries behave like dynamically scoped variables, and these are fine": I agree, as long as they stay at process boundaries.