| HN Mirror



	by ford 1034 days ago
	So far it's been ~8 months since ChatGPT started the (popular) LLM craze. I've found raw GPT to be useful for a lot of things, but have yet to see my most frequently used apps integrate it in a useful way. Maybe I'm using the wrong apps... It'll be interesting to see what improvements (in a lab or at a company) need to happen before most people use purpose-built LLMs (or behind-the-scenes LLM prompts) in the apps they use every day. The answer might be "no improvements", and we're just in the lag time before useful features can be built.

2 comments

Legend2440 1034 days ago

There are some unsolved practical problems like prompt injection, the difficulty of using them on your own data, etc.

But the biggest problem is that they take so much compute, which slows down both research and deployment. Only a handful of giant companies can train their own LLM, and it's a major undertaking even for them. Academic researchers and everyday tinkerers can only run inference on pretrained models.


p1esk 1034 days ago

Sounds like a great motivation for academic researchers to find a way to train LLMs with less compute. Or maybe invent something better than transformers. A brain trains on 20 watts, after all.


Legend2440 1034 days ago

That's a hardware difference. Brains run at a very low clock speed and make up for it with massive parallelism. They also don't suffer from the von Neumann bottleneck: today's computers spend most of their time and energy shuffling the network in and out of memory.

I believe that better hardware architectures will have more impact on AI than better neural network architectures.
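
To put rough numbers on the memory-shuffling point above, here is a minimal sketch that compares the time to stream a model's weights from memory once with the time to do the arithmetic for one generated token. Everything in it is an assumption for illustration: the function name `per_token_times`, the model size, and the bandwidth and FLOP figures describe a hypothetical accelerator, not measurements of any real chip.

```python
def per_token_times(n_params, bytes_per_param, mem_bandwidth_bytes_s, peak_flops):
    """Return (memory_time_s, compute_time_s) for generating one token.

    Simplifying assumptions: every weight is read from memory once per
    token, and each weight contributes one multiply and one add (2 FLOPs).
    """
    memory_time = n_params * bytes_per_param / mem_bandwidth_bytes_s
    compute_time = 2 * n_params / peak_flops
    return memory_time, compute_time


# Hypothetical numbers, chosen only for illustration.
mem_t, comp_t = per_token_times(
    n_params=7e9,                # 7B-parameter model
    bytes_per_param=2,           # 16-bit weights
    mem_bandwidth_bytes_s=1e12,  # 1 TB/s memory bandwidth
    peak_flops=1e14,             # 100 TFLOP/s peak arithmetic rate
)
print(f"memory: {mem_t * 1e3:.2f} ms, compute: {comp_t * 1e3:.2f} ms")
```

Under these assumed figures the weight traffic dominates the arithmetic by a wide margin, which is the sense in which single-stream inference is memory-bound. Batching many requests changes the balance, so treat this as a back-of-the-envelope estimate only.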


zare_st 1033 days ago

Today's computers don't spend most of their time on I/O. The average CPU sits idle most of the time. I/O does not require CPU power; DMA has existed since the 80s or whenever. Software and segmentation limits, such as operating systems and traversal between execution rings, are not the hard chemical barrier that brains have.

You're correct that our architecture isn't adequate and that the biggest gains lie there. I/O is not the problem; in fact, our I/O is faster, because it is dumb. We can place massive amounts of data in a linear memory buffer, but brains use massively associative memory structures. I/O of a network packet is easy. Associating that packet with preexisting context (such as a TCP connection) is not that easy: it requires structures, algorithms, memory locality, threading correctness, and procedural computing steps, because we abstract the context over a series of flat data.
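
A minimal sketch of that association step, assuming the context is kept in a table keyed by the (source IP, source port, destination IP, destination port) 4-tuple. The names `Connection` and `ConnectionTable` are hypothetical, and this is nowhere near a real TCP stack: there are no sequence numbers, no state machine, and no locking.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Per-connection context that has to be located for every packet."""
    bytes_received: int = 0
    payloads: list = field(default_factory=list)


class ConnectionTable:
    def __init__(self):
        # Maps (src_ip, src_port, dst_ip, dst_port) -> Connection.
        self._table = {}

    def deliver(self, src_ip, src_port, dst_ip, dst_port, payload):
        """Find (or create) the packet's connection and update its context."""
        key = (src_ip, src_port, dst_ip, dst_port)
        conn = self._table.get(key)
        if conn is None:
            conn = Connection()
            self._table[key] = conn
        conn.bytes_received += len(payload)
        conn.payloads.append(payload)
        return conn


table = ConnectionTable()
table.deliver("10.0.0.1", 40000, "10.0.0.2", 80, b"GET / ")
conn = table.deliver("10.0.0.1", 40000, "10.0.0.2", 80, b"HTTP/1.1")
other = table.deliver("10.0.0.3", 40001, "10.0.0.2", 80, b"hello")
print(conn.bytes_received, other.bytes_received)
```

Even in this toy form, the packet bytes are flat data and the context lives in a separate structure that has to be looked up procedurally on every arrival.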

If you're working hard on a subject, a random stray piece of info about something else that concerns you might trigger an "I can't think about that right now" reaction in your brain, but the information has still been digested. The packet has reached the appropriate layer 7 ingress buffer just like that, but you deliberately choose not to context switch to the respective application.

There is also an elephant in the room: native language, which shapes the way we think and process information. Imagine a CPU receiving an automatic microcode update the moment you, as a programmer, define an abstract TCP stack in C or assembler, so that it can optimize itself to the point of being able to switch to "thinking in TCP" mode.


p1esk 1033 days ago

Pretend you have any hardware you want, today. What would you do with it? What model would you train? How do you know available hardware is the bottleneck and not model architecture?


Legend2440 1033 days ago

Because with infinite hardware I'd be able to do neural architecture search and find the optimal model architecture.

And I'd be able to train a learned optimizer to replace gradient descent as the training process.

Even without either of those, performance improves in a predictable way with more compute thanks to scaling laws.
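
A minimal sketch of what "predictable" means here, assuming loss follows a power law in training compute, L(C) = a * C^(-b). The function name `predicted_loss` and the constants `a` and `b` are made up for illustration; they are not fitted values from any published scaling-law study.

```python
def predicted_loss(compute, a=10.0, b=0.05):
    """Power-law loss curve L(C) = a * C**(-b); a and b are hypothetical."""
    return a * compute ** (-b)


# Each 100x increase in compute multiplies the predicted loss by the
# same constant factor, 100**(-b), regardless of the starting point.
for c in (1e18, 1e20, 1e22):
    print(f"compute={c:.0e} FLOPs  predicted loss={predicted_loss(c):.3f}")
```

On a log-log plot a curve like this is a straight line, which is what makes extrapolating to a larger compute budget possible under the power-law assumption.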


Silphendio 1033 days ago

Neuromorphic computing is already a thing, and Intel is already developing chips (Loihi 2). But it's not as powerful as GPUs yet, and it only runs spiking neural networks.
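
For readers unfamiliar with spiking networks, here is a minimal sketch of a single leaky integrate-and-fire neuron, the textbook building block of such models. This is a generic illustration and not Loihi's programming model or API; the function name `simulate_lif` and the threshold, leak, and reset values are assumptions.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    Returns a list of 0/1 spikes, one per input sample.
    """
    potential = reset
    spikes = []
    for current in input_current:
        # Leak part of the stored potential, then integrate the new input.
        potential = leak * potential + current
        if potential >= threshold:
            spikes.append(1)
            potential = reset  # fire, then reset the membrane potential
        else:
            spikes.append(0)
    return spikes


print(simulate_lif([0.5, 0.5, 0.5, 0.0, 0.0, 1.2]))
# prints [0, 0, 1, 0, 0, 1]
```

The neuron communicates only through sparse, discrete spikes instead of dense floating-point activations, which is the property neuromorphic hardware is built to exploit.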


pishpash 1034 days ago

That would be FPGAs.


josephg 1034 days ago

I doubt it. FPGAs are super inefficient in transistor count in exchange for being dynamically programmable. I suspect a better architecture will be taped out like any other chip.


netdur 1034 days ago

I have helped build behind-the-scenes use cases: one classified emails and routed them to the intended parties, the other was quality monitoring for a call center.
