by superkuh
1166 days ago
I think you have it backwards. The Python (i.e., huggingface, etc.) implementations of transformers are the complex ones, with dependency hell so bad that there's even a layer of package manager / env hell. This version of fastchat (there are 2) required a particular commit of the huggingface libs for quite a while, something that only changed recently. And it'll happen again in the future. Python just hides this complexity... until it doesn't. Like beautiful but rapidly rotting fruit. llama.cpp will remain a single, two-line project (git clone https://github.com/ggerganov/llama.cpp, make -j) that will compile easily and run on anything. No external deps to pin to a particular commit (that will only have a lifetime of some months) as things change rapidly. That said, the changes in the ggml weights format over the last 2 weeks were annoying, but now that the mmap-style weights are settled on, there should be less converting. In that sense huggingface wins: it only has two incompatible weights formats. llama.cpp's ggml has had 3.
Also, nice to see you again, superkuh. I frequented your IRC channel about a decade ago.