
by superkuh 1213 days ago

I think you have it backwards. The python (i.e., huggingface, etc.) implementations of transformers are the complex ones, with dependency hell so bad that there's even a layer of package manager / env hell on top. This version of fastchat (there are 2) required a particular commit of the huggingface libs for quite a while, something that only changed recently. And it'll happen again in the future. Python just hides this complexity... until it doesn't. Like beautiful but rapidly rotting fruit.

llama.cpp will remain a single two-line project (git clone https://github.com/ggerganov/llama.cpp, make -j) that compiles easily and runs on anything. No external deps to pin to a particular commit (that will only have a lifetime of some months) as things change rapidly.

That said, the changes to the ggml weights format over the last 2 weeks were annoying, but now that the mmap-style weights are settled on there should be less converting. In that sense huggingface wins: it only has two incompatible weights formats. llama.cpp's ggml has had 3.

3 comments

sterlind 1213 days ago

I've spent the past couple days packaging an LLM playground environment as a Nix expression. it's been pure hell.

also nice to see you again, superkuh. I frequented your IRC channel about a decade ago.


superkuh 1213 days ago

Using nix and then complaining about having to set up your compilation environment libs/etc is kind of like sticking a rod in your bike's wheel spokes and complaining about crashing. Don't give up on the idea of system libraries (i.e., by using nix) and this doesn't happen.

Also, hi? I don't recall you by that nick but the internet is a small place sometimes.


sterlind 1213 days ago

oh, I'm very aware that I've brought this upon myself, but I'm sticking it out for the greater good (and out of stubbornness.)

specifically, I'm trying to benchmark a bunch of different GPU configurations on different workloads on vast.ai, which uses Docker containers. I abhor Dockerfiles and my experience building containers with nix has been pleasant, so that's what I'm doing and why. fortunately I think I'm getting past the learning curve.

did our channel survive the demise of freenode? I was andares. I think I used to be annoying, but I've gotten better.


superkuh 1213 days ago

Ah. Hi! Yes. We still exist in the same place but on libera now.


anotherhue 1212 days ago

Care to share some of your progress? I have similar (stronger?) feelings regarding Dockerfile's big-ball-of-state nonsense.

(The irony of holding this opinion while dealing with pre-trained AI models is not lost)


sterlind 1212 days ago

I finished my work on poetry2nix and submitted a PR which works perfectly (at least with preferWheels=true.) now I have a wonderful live environment with torch, triton, transformers, etc. Docker builds are fast and lightweight since I use buildLayeredImage. it is, truly, the promised land my forefathers prophesied.


siraben 1213 days ago

Have you been successful in getting the LLM playground up with Nix?


sterlind 1213 days ago

yes, almost! I used poetry2nix and grafted a bunch of overrides to fix the torch-2.0 build, and I just got cuda working with it. I'm testing triton now. I'll submit my PR to poetry2nix so watch that space if you want it.


zhisbug 1213 days ago

no, the requirement on a particular HF commit has been fixed. It is no longer needed.


superkuh 1213 days ago

Right. That particular problem has been fixed. But the fact that it was needed indicates it will happen again. It exposes the underlying complexity of the huggingface transformer stack. It's wonderful code, don't get me wrong. It's just the furthest thing possible from the least complex.


zhisbug 1213 days ago

it is really a matter of having faith in pytorch (or JAX) or in third-party cross-platform support like llama-cpp. Apparently pytorch reduces a lot of complexity, and its cross-platform support is growing much faster.

And, PyTorch does so well on GPUs!


Casteil 1213 days ago

This has been my experience so far as well. GPT4All feels pretty fragile with all its dependencies.
