Hacker News
by intrepidhero 2072 days ago
What I really want for Python is a knob to improve startup time. I've imagined there must be a way to "statically link" dependencies so that import isn't searching the disk but just loading from a fixed location/file. There don't seem to be many resources on the net. I've found this one: https://pythondev.readthedocs.io/startup_time.html. I tried using virtualenvs to limit my searchable import paths, and messed around with Cython in an effort to come up with a statically linked binary. But I've yet to come up with anything that really improves the startup time. Clearly I have no idea what I'm doing.
8 comments

I once got quite a bit of startup time improvement by simply swapping out CPython's malloc calls for a version that reserved a large block of memory up front (~5GB, can be tuned to your workload) and allocated from that. CPython makes many thousands of malloc calls at startup, so this gave a significant improvement.
This is a really interesting approach. I'd love to hear more. Were you patching cpython then? Is your work online somewhere?
See sibling. But yes I patched cpython, it’s pretty easy to build yourself.
Can you please share some numbers if you have them? How much improvement, etc.
60% decrease in startup time, 10% improvement in general runtime. Of course if you exceed the initial allocation the whole thing will crash. That could probably be worked around with a bit more intelligence in the allocator.

This was part of a class project so not available online unfortunately. It’s good practice to implement it yourself though! There are lots of resources online for implementing fast allocators.
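
For readers who want a feel for the idea, here is a toy model of it in Python, not the CPython patch described above (which is not public). It assumes a simple "bump" design: reserve one buffer up front and hand out slices by advancing an offset. The class name `BumpArena` and its methods are hypothetical, chosen only for illustration.

```python
class BumpArena:
    """Toy arena: reserve one block up front, hand out slices by advancing an offset."""

    def __init__(self, capacity: int) -> None:
        self._buffer = bytearray(capacity)      # the single up-front reservation
        self._view = memoryview(self._buffer)
        self._offset = 0                        # next free byte

    def allocate(self, size: int, align: int = 8) -> memoryview:
        # Round the current offset up to the requested alignment.
        start = (self._offset + align - 1) // align * align
        end = start + size
        if end > len(self._buffer):
            # Mirrors the caveat in the comment: exceeding the reservation fails.
            raise MemoryError("arena exhausted")
        self._offset = end
        return self._view[start:end]

    @property
    def used(self) -> int:
        return self._offset
```

Each allocation is a bounds check and an addition, which is why this style is fast. The trade-off is the one mentioned above: nothing is freed individually, and running past the reservation is fatal unless the allocator falls back to something smarter.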

>Clearly I have no idea what I'm doing.

Doesn't look like that from over here.

Many times the difference between failure and the magic spell working is 1 more late night iteration. In this specific case you are working against some difficult constraints that are deep in the language. That said, there is almost always a way to side-step a problem altogether. You may find that one workaround is to amortize the startup concern over time, i.e. reorient the problem domain so you only have to start the Python process once a day. Or, find a way to defer loading of required components until the runtime actually needs them.

It is trivial in Python to move towards a lazier way to load modules on first use; it's just not idiomatic or very readable (thus we do top-level imports).
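
One way this could look, as a minimal sketch: the standard library's `importlib.util.LazyLoader` postpones executing a module until the first attribute access. The helper name `lazy_import` is made up here for illustration, and the early return for already-imported modules is an added assumption.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose body runs only on first attribute access."""
    if name in sys.modules:
        # Already imported somewhere else; nothing left to defer.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)   # sets up the lazy module; does not run its code yet
    return module
```

Usage would be something like `colorsys = lazy_import("colorsys")` at the top of a script; the module's code runs the first time an attribute such as `colorsys.rgb_to_hsv` is touched. The plainer alternative is to put the `import` statement inside the function that needs it.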
If you want to improve startup for a script that you use frequently consider using one of the app builders such as PyOxidizer[1]. They do work to improve startup by embedding all the modules in the binary and then loading them from memory.

[1] https://pyoxidizer.readthedocs.io/en/latest/index.html

What I found is that Python is not that terrible (interpreter starts in 60ms on my laptop and imports an empty local file: `touch empty.py; time python3 -c 'import empty'`).

However, idiomatic Python shortcuts to expose everything at the top level (star imports or imports of everything in the top-level __init__.py) cause everything to be imported everywhere. __all__ is all but forgotten, so importing things like flask, sqlalchemy, requests and similar will take anywhere from 100-500ms each, even if you just need a single function from a submodule.

Worst offenders are things which embed their own copy of requests (likely for reproducible builds) taking upwards of 800ms just to import even if your project already imported requests directly.

I don't think it has anything to do with search paths, but simply with loading and executing hundreds of files. If you need those modules, Python will read them. Perhaps moving your venv to a "ramdisk" might help?
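
To check numbers like these on your own machine, a rough sketch is to time a fresh interpreter per import. The function name `time_import` is hypothetical; the result includes interpreter startup, so compare it against a baseline run that imports nothing.

```python
import subprocess
import sys
import time


def time_import(module=None):
    """Wall-clock seconds for a fresh interpreter to start and import `module`.

    With module=None the child runs `pass`, which gives a startup baseline.
    """
    code = f"import {module}" if module else "pass"
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", code], check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    baseline = time_import()
    print(f"bare interpreter: {baseline * 1000:.0f} ms")
    for name in ("json", "decimal"):
        print(f"{name}: {(time_import(name) - baseline) * 1000:+.0f} ms over baseline")
```

Single runs are noisy, so repeat a few times before drawing conclusions. For a per-module breakdown, CPython 3.7 and later also accept `python -X importtime -c 'import yourmodule'`, which prints the time spent in each imported module to stderr.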

Have you established that searching for modules is slow? I think it just takes time to actually process the imported modules and load everything into memory.
On Windows (what with atrocious NTFS performance and all) an interpreter that's using a zipped library is way faster than one using loose modules.
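
For anyone wanting to try the zipped-library idea: Python can import directly from a zip archive placed on `sys.path`. A small self-contained sketch follows; the helper `import_from_zip`, the archive name `bundle.zip`, and the demo module are all made up for illustration.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def import_from_zip(module_name, source):
    """Write `source` into a zip archive as <module_name>.py and import it from there."""
    archive = Path(tempfile.mkdtemp()) / "bundle.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(f"{module_name}.py", source)
    # A zip file on sys.path is searched like a directory (via zipimport).
    sys.path.insert(0, str(archive))
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


if __name__ == "__main__":
    demo = import_from_zip("zipped_demo_mod", "ANSWER = 42\n")
    print(demo.ANSWER)
```

Whether this helps depends on the platform and filesystem. The point is that one archive is opened once instead of many small files being located and read separately.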
No. In fact my experiments suggest otherwise. That was just where my intuition led me.
You can compile your Python code using Nuitka, the resulting binary has much better startup time. I do this for a couple of command line tools.
Couldn't a Python implementation in Truffle (GraalVM) solve the startup problem?
That's GraalPython.

    python -s [-S]