by CharlesW 1194 days ago
I'm a Python neophyte, but I've read that Python 3.11 is 10-60% faster than 3.10, so that may be a consideration.

In this particular case that doesn't matter, because the only time you run Python is for a one-off conversion against the model files.

That takes at most a minute, and once the model files are converted you never need to run it again. Actual llama.cpp model inference uses compiled C++ code with no Python involved at all.
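
To put a rough number on it, here is a back-of-the-envelope sketch. It assumes the conversion takes the full minute and that "N% faster" means a speedup factor of 1 + N/100; the function name and the 60-second figure are only illustrative.

```python
def seconds_saved(runtime_s: float, percent_faster: float) -> float:
    """Time saved when a job of runtime_s seconds runs percent_faster % faster.

    Assumption: "N% faster" is read as a speedup factor of 1 + N/100.
    """
    speedup = 1 + percent_faster / 100
    return runtime_s - runtime_s / speedup


# Hypothetical 60-second conversion, at both ends of the quoted 10-60% range.
for pct in (10, 60):
    print(f"{pct}% faster saves about {seconds_saved(60, pct):.1f} s, once")
```

Under those assumptions the saving is somewhere between about 5 and 23 seconds, paid a single time, which is why the interpreter version hardly matters here.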