HN Mirror



	by xeox538 435 days ago
	I believe we're currently seeing AI in the "mainframe" era, much like the early days of computing, where a single machine occupied an entire room and consumed massive amounts of power, yet offered less compute than what now fits in a smartphone. I expect rapid progress in both model efficiency and hardware specialization. Local inference on edge devices, using chips designed specifically for AI workloads, will drastically reduce energy consumption for the majority of tasks. This shift will free up large-scale compute resources to focus on truly complex scientific problems, which seems like a worthwhile goal to me.

3 comments

whatnow37373 435 days ago

The CPU development curve is often thrown around, but it very seldom fits anything else in reality. It was a very rare and extraordinary set of coincidences that got us here. Computation using silicon turned out to have massive growth potential for a variety of lucky reasons, but, say, battery tech is not so lucky, nor is fusion, nor is quantum computing.

The low-hanging fruit has been plucked by said silicon development process, and while remarkable improvement in AI efficiency is likely, it is highly unlikely to follow a similar curve.

More likely is a slow, incremental process taking decades. We cannot just wish away billions of parameters and the need for trillions of operations. It’s not like we have some open path of possible improvement like with silicon. We walked that path already.

Maybe photonics...


righthand 435 days ago

I don’t understand the “chips designed for AI workloads” sentiment I hear all the time. LLMs were designed using GPUs. The hardware already exists, so what will make it use less energy in a world where GPUs over the last decade have only become bigger, hotter, and more power-hungry? If we could develop LLMs on anything less, we probably would have shifted back to CPUs already.


minimaxir 435 days ago

Google's TPUs are an example of efficient chips designed for AI workloads, and were in development before the LLM boom.

It's just hard to replicate the power and efficiency of CUDA.


panstromek 435 days ago

It sure seems like that to me. I was pretty impressed by how easily I could run a small Gemma model on a 7-year-old laptop and get a decent chat experience.

I can imagine that doing some clever offloading to normal programs and using the LLM as a sort of "fuzzy glue" for the rest could improve the efficiency of many common tasks.


coolcase 435 days ago

Big tech ain't investing heavily so you can run local. What data does that leave them to sell, and what power and control does that give them? Zilch.


panstromek 434 days ago

I mean... cute conspiracy, but it doesn't correspond with reality. Just look at what Google is releasing: they are trying to make these things fit on consumer hardware.
