| HN Mirror



	by omneity 975 days ago
	PyTorch/XLA is such a pain to use. And once you go TPU, switching back takes just as much effort, so you can't quickly test how it performs on your problem.

1 comments

lumost 975 days ago

One of the big reasons custom hardware solutions struggle.

IMO, as a hardware vendor you'd have better luck implementing an LLM-specific toolchain and bypassing a general-purpose DL framework. At the very least, you should be able to post more impressive results that way than with a half-baked PyTorch port.

omneity 975 days ago

I feel like that would make it harder for a vendor to keep up with the industry.

Say you put all the effort in the world into building a custom LLM toolchain to train a Llama on custom hardware. Then suddenly someone comes up with LoRA. You haven't even finished porting it to your toolkit when someone comes up with GPTQ.

Can't keep up with a custom toolchain imo.

It's like a forked Linux kernel. Eventually you're going to have to upstream if you're serious about it, which is what AMD is actively doing with PyTorch for ROCm (masquerading as CUDA for compatibility).
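
To illustrate the "masquerading" point: ROCm builds of PyTorch expose the GPU through the same `torch.cuda` namespace, and set `torch.version.hip` instead of `torch.version.cuda`. A rough sketch for telling the builds apart follows; `describe_backend` and `probe` are made-up helper names for illustration, not part of PyTorch.

```python
from typing import Optional


def describe_backend(hip_version: Optional[str], cuda_version: Optional[str]) -> str:
    """Classify a PyTorch build from its version strings (hypothetical helper)."""
    if hip_version is not None:
        # ROCm build: GPU code still goes through torch.cuda and device "cuda".
        return "rocm"
    if cuda_version is not None:
        return "cuda"
    return "cpu"


def probe() -> str:
    """Inspect the locally installed PyTorch, if there is one."""
    try:
        import torch
    except ImportError:
        return "torch-not-installed"
    return describe_backend(
        getattr(torch.version, "hip", None),
        getattr(torch.version, "cuda", None),
    )


if __name__ == "__main__":
    print(probe())
```

Either way, user code keeps writing `device="cuda"`, which is the compatibility being described.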

ronsor 974 days ago

I disagree. llama.cpp[0] is a good counterpoint to this, since it uses a custom ML framework created from scratch. Despite not having the developer team of a large company, it still keeps up with many of the advancements in LLMs.

[0] https://github.com/ggerganov/llama.cpp

elwypea 974 days ago

llama.cpp is not necessary for creating lots of demand for the chip it was originally written for (Apple M1), whereas new hardware vendors need to demonstrate that they can plug in to existing tools to generate enough demand to ship in volume.

ronsor 973 days ago

> lots of demand for the chip it was originally written for (Apple M1)

To be fair, the M1/M2 chip can't be purchased or used separately from the Mac, unlike GPUs or socketed CPUs, and demand for Macs is already fairly high.

elwypea 975 days ago

That might be good enough to get a hardware startup acquired, but not good enough to get major sales. Users want PyTorch and negligible switching cost between chips.

Bigger problem for startups trying to muscle in on LLMs is that there isn't much room for improvement on existing solutions to do something radically different.

lumost 974 days ago

>Bigger problem for startups trying to muscle in on LLMs is that there isn't much room for improvement on existing solutions to do something radically different.

Aye. Unless you can notch a 10x cost/performance improvement, the migration overhead will just make it not worth switching.
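
A back-of-the-envelope way to see this, as a sketch only: if the new chip does the same work for `1/speedup` of the cost, switching pays off only once the savings exceed the one-time migration cost. The function name and every number fed to it below are hypothetical, not figures from this thread.

```python
def breakeven_spend(migration_cost: float, speedup: float) -> float:
    """Spend on the incumbent hardware at which switching pays for itself.

    Work that costs S on the old chip costs S / speedup on the new one,
    so the saving is S * (1 - 1 / speedup). Switching breaks even when
    that saving equals the one-time migration cost.
    """
    if speedup <= 1:
        # No cost/performance gain: migration never pays off.
        return float("inf")
    return migration_cost / (1.0 - 1.0 / speedup)


if __name__ == "__main__":
    # Hypothetical migration cost of 500 (any currency unit).
    for s in (1.2, 2.0, 10.0):
        print(s, breakeven_spend(500.0, s))
```

Small gains push the break-even spend far out, while a 10x gain brings it down to roughly the migration cost itself.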
