Hacker News
by dartos 968 days ago
Yeah. 224 MB is nothing.

Small LLMs that have been quantized to optimize for space still sit at 4-9 gigs.

3 comments

I would expect LLM hardware to routinely support between 32 and 512GB of memory in the Very Near Future, and 1-4TB by the end of the decade. Custom hardware for GPT and LLM technology only started being developed in earnest in September 2022.
I thought the article was mixing up MB and GB at first. Is that the size of the LLM with no training data? Is there any value in that?
Once the model is trained, it doesn't need to keep the training data around.

GPT-3 is about 175 billion parameters (though I have no idea how many bits per parameter OpenAI uses at inference-time), and is apparently trained on 45 TB of data[0]

[0] Caution: citation was first hit on google, YMMV — https://www.springboard.com/blog/data-science/machine-learni...
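
To put the parameter count in memory terms, here is a back-of-envelope sketch. The bit widths are assumptions for illustration only (as noted above, what OpenAI actually uses at inference time isn't known), and `weight_footprint_gb` is just a hypothetical helper name. It counts the weights alone, ignoring activations and any runtime overhead.

```python
def weight_footprint_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate storage for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


GPT3_PARAMS = 175_000_000_000  # parameter count quoted in the comment above

# Assumed precisions; none of these is confirmed for any real deployment.
for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit: {weight_footprint_gb(GPT3_PARAMS, bits):,.1f} GB")
```

Under those assumptions the weights come to 700, 350, 175 and 87.5 GB respectively, which is orders of magnitude smaller than 45 TB of training data and orders of magnitude larger than 224 MB.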

Presumably if you are training a robot to use a new/different tool you'll want the ability to train on site. If you buy an IHOP restaurant, the pancake robot in the kitchen ought to be able to be repurposed as a hamburger robot for your cheeseburger business. Omelette scrambling robots could be trained to mix small batches of cookie dough. Etc etc. Toyota is working on developing a framework for this already.
Some robot firms are integrating LLMs to make the robots more general: https://www.youtube.com/watch?v=Vq_DcZ_xc_E&t=2s

On-site training is… not really solved yet. Not efficiently, at any rate: any task can be trained with sufficient compute and/or examples, but probably more than most companies would care to bother with, and certainly more than we'd get onto one of the chips in the article.

That's not to diss the chips. As I understand it, one of the biggest issues is the power envelope of mobile units, so making the computations more energy efficient is going to help massively. It's just that "training" and "inference" are currently very distinct tasks with very different hardware requirements.

(Also, I'm not sure if you mean those examples as illustrations or are serious about them: if you're serious, I suspect an old-fashioned robot arm bolted to the ground and following a pre-programmed path will probably cover your needs — GOFAI is great in restricted domains, the more modern AI models are more appropriate when the environment is more chaotic and less predictable, such as collaborating in a kitchen that also has humans or being asked on the fly to do a new recipe it's never encountered before).

These are the trained weights and biases. The training data is unknown in size but could be terabytes… I’ve no idea how to even guess at the size of the training data, but it doesn’t all need to be in RAM at the same time.
That's not true, there are some very pruned (and relatively dumb) LLMs which go under a gig.
Small LLMs or large small language models???
What’s the point of a state of the art AI chip that can’t run large models? It seems problematic to say the least!
It is a new design, not a direct competitor for nVidia’s flagship. And there are lots more applications for AI than LLMs.
Real-time, low-power inference isn't really possible today. Think anything motion- or reaction-related.
Some of us still work on computer vision. 224MiB is fairly massive for a convolutional neural network.
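
For a rough sense of scale, this sketch counts how many parameters fit in 224 MiB at a few assumed precisions. It assumes the whole capacity is used for weights, which is optimistic since activations and buffers need room too, and `params_that_fit` is a hypothetical helper rather than anything from the chip's tooling.

```python
MIB = 1024 ** 2  # bytes per mebibyte


def params_that_fit(capacity_bytes: int, bits_per_param: int) -> int:
    """Whole parameters that fit if the entire capacity holds weights."""
    return capacity_bytes * 8 // bits_per_param


capacity = 224 * MIB  # on-chip memory figure discussed in the thread

# Assumed precisions, chosen only for illustration.
for bits in (32, 16, 8):
    print(f"{bits:>2}-bit: {params_that_fit(capacity, bits):,} parameters")
```

Even at 32-bit that is about 58.7 million parameters, which is more than twice the roughly 25.6 million of a ResNet-50.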
It can run small models very fast.