| HN Mirror


by declaredapple 854 days ago

What?

Are you asking if the framework automatically quantizes/prunes the model on the fly?

Or are you suggesting the LLM itself should realize it's too big to run, and prune/quantize itself? Your references to "intelligent" almost lead me to the conclusion that you think the LLM should prune itself. Not only is this a chicken-and-egg problem, but LLMs are statistical models; they aren't inherently self-bootstrapping.

2 comments

dheera 854 days ago

I realize that, but I do think it's doable to bootstrap it on a cluster and have it teach itself to self-prune, and I'm surprised nobody is actively working on this.

I hate software that complains (about dependencies, resources) when you try to run it, and I think fixing that should be one of the first use cases for LLMs: L5 autonomous software installation and execution.


Red_Leaves_Flyy 854 days ago

Make your dreams a reality!


lobocinza 847 days ago

Worst is software that doesn't complain but fails silently.


2099miles 854 days ago

The LLM itself should realize it’s too big and only put the important parts on the GPU. If you’re asking questions about literature, there’s no need to have all the params on the GPU; just tell it to put only the ones for literature on there.
