by jbergknoff 2237 days ago
> One of my favorite Python features is the way that the files and directories your application is made of map one-to-one with how you import and use them in code.

Funny to see this stated explicitly in this way. In my opinion, this is one of Python's biggest flaws (I'm a big fan of everything but the module system). Paths to files on disk should be treated as string literals, not magic unquoted strings that look like they're language keywords.

You can easily end up in situations where a directory in your project's tree has a name conflict with some library, and this causes issues, which is mind-bogglingly bad design (incidentally: also not a one-to-one map). If you don't live and breathe Python, and accidentally put a hyphen in a filename, God help you.


There's also the thing where you used to need a magic __init__.py file for a folder to be considered a module, but now it's optional, although I still have no idea what the formula is for when Python decides something is a module…
It's not really optional. There's just this new thing called namespace packages which is not quite the same. I don't fully understand what it's for yet.
Namespace packages let you ship `foo.bar` and `foo.baz` separately. `bar` and `baz` are really two separate packages that are both part of the `foo` namespace. `foo` is not a real package.

They are not new but implicit namespace packages are new-ish. Before 3.3 it was more complicated to set up, now you just omit __init__.py from `foo`.

I don't see any reason to use this if `foo` is shipped as a single package.
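To make the split concrete, here is a minimal sketch of an implicit namespace package. The names `foo_ns`, `dist_a`, `dist_b` and the helper `build_split_namespace` are made up for the illustration (`foo_ns` stands in for `foo` above), and the two temporary directories stand in for two separately installed distributions.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_split_namespace(root, namespace="foo_ns"):
    """Create two fake 'distributions' that each ship one module of the namespace."""
    roots = []
    for dist, module, value in [("dist_a", "bar", 1), ("dist_b", "baz", 2)]:
        pkg_dir = root / dist / namespace  # deliberately no __init__.py here
        pkg_dir.mkdir(parents=True)
        (pkg_dir / f"{module}.py").write_text(f"VALUE = {value}\n")
        roots.append(str(root / dist))
    return roots


tmp = tempfile.TemporaryDirectory()
roots = build_split_namespace(Path(tmp.name))

# Both roots go on sys.path, as if two separate installs had happened.
sys.path[:0] = roots
importlib.invalidate_caches()

bar = importlib.import_module("foo_ns.bar")
baz = importlib.import_module("foo_ns.baz")

# The namespace package itself has no single directory: its __path__
# collects one "portion" per root that contains a foo_ns/ directory.
portions = list(sys.modules["foo_ns"].__path__)
print(portions)
```

If either `foo_ns/` directory contained an `__init__.py`, it would become a regular package and the other portion would no longer be merged in, which is why this only makes sense when the pieces really ship separately.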

If the implementation is hard to explain, it's a bad idea.

If the implementation is easy to explain, it may be a good idea.

This is right.
Name conflicts were basically fixed in 2003 with absolute/relative imports ( https://www.python.org/dev/peps/pep-0328/ ). With relative imports you never have to include the name of your library in imports within your library, and with absolute imports they will never conflict with a base-level (site/system) package. Regarding hyphens and such in package names pretty much every language has restrictions on identifier names.
If I create a file in my project named requests.py, then a sibling file's `import requests` starts importing that file, instead of the library. Maybe name conflicts are "basically fixed" in the sense that there are ways to avoid them, but there is still surprising and magical (in the worst possible sense) behavior to trip over.
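A rough way to see this without actually breaking anything: ask the import machinery which file would win for the name `requests` when the project directory sits first on the search path, which is how it looks when you run a script from that directory. The helper `resolve` and the temporary project directory are made up for this sketch; `PathFinder.find_spec` is the standard path-based finder and does not consult `sys.modules`.

```python
import sys
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def resolve(name, search_path):
    """Return the file that `import name` would load for the given search path."""
    spec = PathFinder.find_spec(name, search_path)
    return None if spec is None else spec.origin


project = tempfile.TemporaryDirectory()
project_dir = Path(project.name)
(project_dir / "requests.py").write_text("# local helper, not the HTTP library\n")

# Running `python some_script.py` puts the script's directory at sys.path[0];
# that situation is simulated here by prepending the project directory.
script_style_path = [str(project_dir)] + sys.path

winner = resolve("requests", script_style_path)
shadowed = Path(winner).resolve() == (project_dir / "requests.py").resolve()
print(winner, shadowed)
```

The first match on the search path wins, so the local file is chosen whether or not the real library is installed.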

> Regarding hyphens and such in package names pretty much every language has restrictions on identifier names.

This particular restriction is an artifact of the bad design of treating paths on disk as special language tokens instead of string literals.

If your Python files are not in a Python package (do not have __init__.py in their directory), then they import non-relatively and resolution depends on sys.path. If you have a multi-file Python project in which your source files have inter-dependencies, you should be structuring it as [nested] packages (and thus be doing 'from . import requests'), not as a directory of scripts that happen to try to talk to each other. It's not that this is a 'way to avoid' a problem; it's just that this is the correct way to do this. I fully agree, though, that package management remains one of Python's biggest pain points, made even worse by just how much bad and long-outdated information is out there, and by how 'easy' it is to just dump code into a .py and run it, not knowing just how wrong that is for anything but a temporary scratch file or a deliberately engineered self-contained single-file portable script like black.
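For illustration, a small sketch of the package layout being described, built in a temporary directory. The package name `myapp_demo` and its files are hypothetical; the point is only that `from . import requests` inside a package resolves to the sibling module under the package's own name.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = tempfile.TemporaryDirectory()
pkg = Path(root.name) / "myapp_demo"
pkg.mkdir()

# __init__.py makes this a regular package, so relative imports work inside it.
(pkg / "__init__.py").write_text("")
(pkg / "requests.py").write_text("WHO = 'sibling module inside the package'\n")
(pkg / "client.py").write_text("from . import requests as local_requests\n")

sys.path.insert(0, root.name)
importlib.invalidate_caches()

client = importlib.import_module("myapp_demo.client")

# The sibling is registered as myapp_demo.requests, not as top-level requests.
resolved_name = client.local_requests.__name__
print(resolved_name)
```

Because the sibling lives under the package's qualified name, an absolute `import requests` elsewhere in the package is not redirected to it, provided the package directory itself is not on sys.path.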

Python doesn't treat 'paths on disk' as special language tokens; Python chooses to handle certain tokens by possibly accessing files on disk. Classes in Java (lacking 'modules') resolve to equally named files during compilation, and the same goes for Go, Node, Haskell, and pretty much everything I can think of, with the sole exception of C/C++ preprocessor-driven multi-file development (which is even more filesystem-coupled than the former, as you literally type filenames).

Very good points, especially the bit about a conflict with a library which many of us here have likely experienced.

What would you do instead?

I think anything that treats paths on disks/library names as string literals would be workable. So, for instance, the stuff from the importlib standard library would probably be good enough if it was a top level/global construct that worked with quoted strings.

> from './app/utils.py' import numbers
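Something close to that can already be written with importlib today, just more verbosely. Below is a minimal sketch; the helper name `import_from_path` and the file `app-utils.py` are made up for the example, and the hyphen is there on purpose, since a quoted path has no identifier restrictions.

```python
import importlib.util
import tempfile
from pathlib import Path


def import_from_path(path, module_name):
    """Load a module from an explicit file path given as a plain string."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load a module from {path!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


workdir = tempfile.TemporaryDirectory()
source = Path(workdir.name) / "app-utils.py"  # not importable with a plain import statement
source.write_text("numbers = [1, 2, 3]\n")

utils = import_from_path(str(source), "app_utils")
numbers = utils.numbers
print(numbers)
```

Note that this helper does not register the module in `sys.modules`, so calling it twice executes the file twice; a real top-level construct would presumably need to define caching rules for that.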