Note that this effect varies across builds and versions. For most of Python's history, we could give clear and invariant optimization advice: locals and nonlocals were fastest, global variable access was at least twice as slow, and builtin variable access was even slower. That ordering will likely remain true, but absolute speeds have improved dramatically and the ratios have shifted.
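To check the ordering on your own interpreter, a rough sketch along these lines may help. It is not the benchmark script itself; the names `GLOBAL_VALUE`, `read_local`, `read_global`, `read_builtin`, and `measure` are made up for illustration, and the reported figures include loop overhead, so treat them as relative indicators only.

```python
import timeit

GLOBAL_VALUE = 1  # hypothetical module-level name, looked up in the module globals

ACCESSES_PER_LOOP = 5  # each loop body below reads its name five times


def read_local(loops):
    value = 1  # local variable
    for _ in range(loops):
        value; value; value; value; value


def read_global(loops):
    for _ in range(loops):
        GLOBAL_VALUE; GLOBAL_VALUE; GLOBAL_VALUE; GLOBAL_VALUE; GLOBAL_VALUE


def read_builtin(loops):
    # `len` is found only after the local and global lookups miss
    for _ in range(loops):
        len; len; len; len; len


def measure(loops=200_000, repeat=5):
    """Return approximate nanoseconds per read, loop overhead included."""
    results = {}
    cases = {"local": read_local, "global": read_global, "builtin": read_builtin}
    for label, func in cases.items():
        # Take the best of several runs to reduce noise from other processes.
        best = min(timeit.repeat(lambda: func(loops), number=1, repeat=repeat))
        results[label] = best / (loops * ACCESSES_PER_LOOP) * 1e9
    return results


if __name__ == "__main__":
    for label, ns in measure().items():
        print(f"{label:8s} {ns:6.2f} ns per read (approx.)")
```

Taking the minimum over several repeats is a common way to filter out scheduling noise, but the numbers will still differ between machines, builds, and Python versions.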
Here is a run of Tools/scripts/var_access_benchmark.py on a stock build of Python 3.12rc1 on an Apple M1 Max (your mileage may vary):