by simonw
1194 days ago
In this particular case that doesn't matter, because the only time you run Python is for a one-off conversion of the model files. That takes at most a minute, and once the files are converted you never need to run it again. Actual llama.cpp model inference uses compiled C++ code with no Python involved at all.