

by omneity 976 days ago

I feel like that would make it harder for a vendor to keep up with the industry.

Say you put all the effort in the world into building a custom LLM toolchain to train a Llama on custom hardware. And then suddenly someone comes up with LoRA. You haven't even finished porting it to your toolkit when someone comes up with GPTQ.

Can't keep up with a custom toolchain imo.

It's like a forked Linux kernel. Eventually you're gonna have to upstream if you're serious about it, which is what AMD is actively doing with PyTorch for ROCm (masquerading as CUDA for compatibility).


ronsor 976 days ago

I disagree. llama.cpp[0] is a good counterpoint to this, since it uses a custom ML framework created from scratch. Despite not having the developer team of a large company, it still keeps up with many of the advancements in LLMs.

[0] https://github.com/ggerganov/llama.cpp

elwypea 976 days ago

llama.cpp is not necessary for creating lots of demand for the chip it was originally written for (Apple M1), whereas new hardware vendors need to demonstrate they can plug in to existing tools to generate enough demand to ship in volume.

ronsor 975 days ago

> lots of demand for the chip it was originally written for (Apple M1)

To be fair, the M1/M2 chip can't be purchased or used separately from the Mac, unlike GPUs or socketed CPUs, and demand for Macs is already fairly high.
