	by dartos 1014 days ago
	Yeah. 224 MB is nothing. Small LLMs that have been quantized to save space still sit at 4-9 GB.

hadlock 1014 days ago

I would expect LLM hardware to routinely support between 32 and 512 GB of memory in the Very Near Future, and 1-4 TB by the end of the decade. Custom hardware for GPT and LLM technology only started being developed in earnest in September 2022.

wrigglingworm 1014 days ago

I thought the article was mixing up MB and GB at first. Is that the size of the LLM with no training data? Is there any value in that?

ben_w 1014 days ago

Once the model is trained, it doesn't need to keep the training data around.

GPT-3 is about 175 billion parameters (though I have no idea how many bits per parameter OpenAI uses at inference-time), and is apparently trained on 45 TB of data[0]

[0] Caution: citation was first hit on google, YMMV — https://www.springboard.com/blog/data-science/machine-learni...
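Since the bits per parameter are unknown, a rough sketch can at least bracket the weight size for a few common precisions. The 175 billion figure comes from the comment above; the list of bit widths is an assumption for illustration, not something OpenAI has confirmed, and the estimate ignores activations, caches, and any file-format overhead.

```python
def model_size_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Parameter count quoted in the comment above.
GPT3_PARAMS = 175e9

# Hypothetical precisions; the real inference-time width is not stated in the thread.
for bits in (32, 16, 8, 4):
    size = model_size_gb(GPT3_PARAMS, bits)
    print(f"{bits:>2} bits/param: {size:,.1f} GB of weights")
```

Either way the weights land in the tens to hundreds of gigabytes, which is orders of magnitude smaller than the 45 TB of training data mentioned above.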

hadlock 1014 days ago

Presumably if you are training a robot to use a new/different tool you'll want the ability to train on site. If you buy an IHOP restaurant, the pancake robot in the kitchen ought to be able to be repurposed as a hamburger robot for your cheeseburger business. Omelette-scrambling robots could be trained to mix small batches of cookie dough. Etc etc. Toyota is working on developing a framework for this already.

ben_w 1014 days ago

Some robot firms are integrating LLMs to make the robots more general: https://www.youtube.com/watch?v=Vq_DcZ_xc_E&t=2s

On-site training is… not really solved yet. Not efficiently, at any rate: any task can be trained with sufficient compute and/or examples, but probably more than most companies would care to bother with, and certainly more than we'd get onto one of the chips in the article.

That's not to diss the chips: As I understand it, one of the biggest issues is the power envelope of mobile units, which means making the computations more energy efficient is going to help massively, it's just that "training" and "inference" are currently very distinct tasks with very different hardware requirements.

(Also, I'm not sure if you mean those examples as illustrations or are serious about them: if you're serious, I suspect an old-fashioned robot arm bolted to the ground and following a pre-programmed path will probably cover your needs — GOFAI is great in restricted domains, the more modern AI models are more appropriate when the environment is more chaotic and less predictable, such as collaborating in a kitchen that also has humans or being asked on the fly to do a new recipe it's never encountered before).

andy_ppp 1014 days ago

These are the trained weights and biases. The training data is unknown in size but could be terabytes… I’ve no idea how to even guess at the size of the training data, but it doesn’t all need to be in RAM at the same time.

loufe 1014 days ago

That's not true, there are some very pruned (and relatively dumb) LLMs which go under a gig.

rubatuga 1014 days ago

Small LLMs or large small language models???

andy_ppp 1014 days ago

What’s the point of a state of the art AI chip that can’t run large models? It seems problematic to say the least!

superjan 1014 days ago

It is a new design, not a direct competitor for nVidia’s flagship. And there are lots more applications for AI than LLMs.

nomel 1014 days ago

Realtime, low power, isn't really possible today. Think anything motion or reaction related.

jacobgorm 1014 days ago

Some of us still work on computer vision. 224 MiB is fairly massive for a convolutional neural network.
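To put a number on that, here is a back-of-the-envelope sketch of how many parameters fit in 224 MiB at different precisions. The ResNet-50 parameter count (roughly 25.6 million) is an approximate, commonly quoted figure used here as an assumed reference point, and the calculation counts weights only, not activations or buffers.

```python
MIB = 2**20  # bytes in one mebibyte


def params_that_fit(memory_bytes: int, bits_per_param: int) -> int:
    """Number of whole parameters that fit in the given memory, weights only."""
    return memory_bytes * 8 // bits_per_param


# The 224 MiB figure from the comment above.
MEMORY_BYTES = 224 * MIB

# Assumption: approximate parameter count of ResNet-50, for comparison only.
RESNET50_PARAMS = 25_600_000

for bits in (32, 16, 8, 4):
    capacity = params_that_fit(MEMORY_BYTES, bits)
    ratio = capacity / RESNET50_PARAMS
    print(f"{bits:>2} bits/param: {capacity:,} params (~{ratio:.1f}x ResNet-50)")
```

Under these assumptions a ResNet-50-sized network fits with room to spare even at 32 bits per parameter, while a multi-billion-parameter LLM does not fit at any of these precisions.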

Jensson 1014 days ago

It can run small models very fast.
