
by nico 508 days ago

This is a fascinating concept, i.e., modifying trained LLMs to create different models.

Do these techniques train models while performing the modifications?

Are there pre-trained models that “know how to” modify LLMs for certain goals?

It would be amazing to have models that could strip LLMs to some very basic small model of whatever I want. Like reducing an LLM to something that just knows some basic “American English”, then running that on CPU

1 comments

tsadoq 507 days ago

> Do these techniques train models while performing the modifications?

Depends on what you mean by training; they change the weights.

> Are there pre-trained models that “know how to” modify LLMs for certain goals?

I'm not sure I understand, but there is an example of performing an abliteration on Gemma so that it never refuses to answer. It's about 10 lines of code.
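
For anyone wondering what "changing the weights" without training can look like, here is a minimal sketch of directional ablation on toy data. It assumes the usual recipe: take the difference of mean activations between two prompt sets as a "refusal direction", then project that direction out of a weight matrix. This is not the library's code; every function name here is hypothetical, and plain Python lists stand in for real tensors.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of the two mean activations."""
    diff = [h - b for h, b in zip(mean_vector(harmful_acts),
                                  mean_vector(harmless_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]

def ablate_direction(weight, direction):
    """Return W' = W - r r^T W, so outputs of W' have no component along r.

    `weight` is a d_out x d_in matrix (list of rows); `direction` is a
    unit vector of length d_out.
    """
    d_out, d_in = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along the direction.
    proj = [sum(direction[i] * weight[i][j] for i in range(d_out))
            for j in range(d_in)]
    return [[weight[i][j] - direction[i] * proj[j] for j in range(d_in)]
            for i in range(d_out)]

def matvec(weight, x):
    """Plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]

# Toy activations (assumed data, not from any real model).
harmful = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
harmless = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
r = refusal_direction(harmful, harmless)

W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
W_ablated = ablate_direction(W, r)

# Whatever the input, the edited matrix writes nothing along r.
out = matvec(W_ablated, [1.0, 1.0])
component_along_r = sum(o * ri for o, ri in zip(out, r))
```

No gradient step happens anywhere: the edit is a closed-form projection, which is why it fits in a handful of lines once the activations are collected.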

nico 507 days ago

> > Do these techniques train models while performing the modifications?

> Depends on what you mean by training; they change the weights.

What I wonder: is there a separate model, not the LLM, that gets trained only on how to modify LLMs?

I imagine a model that could learn something like: “if I remove this whole network here, then the LLM runs 50% faster, but drops 30% in accuracy for certain topics”, or “if I add these connections, the LLM will now be able to solve more complex mathematical problems”

So a model that is not an LLM, but is trained on how to modify them for certain goals

Is that how this tool works?
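
To make the question concrete, here is a rough sketch of the kind of data such a "modifier" model would need: for each part removed, how much is left to run and how far the outputs drift. It is a toy under loud assumptions: scalar affine functions stand in for network blocks, mean absolute output drift stands in for accuracy loss, and all names are made up for illustration. It says nothing about how the tool itself works.

```python
def make_layer(scale, shift):
    """A toy 'layer': an affine map on a scalar, standing in for a block."""
    return lambda x: scale * x + shift

def run(layers, x):
    """Apply the layers in order."""
    for layer in layers:
        x = layer(x)
    return x

def ablation_sweep(layers, inputs):
    """Drop each layer in turn and record what is left and the output drift."""
    baseline = [run(layers, x) for x in inputs]
    records = []
    for i in range(len(layers)):
        pruned = layers[:i] + layers[i + 1:]
        outputs = [run(pruned, x) for x in inputs]
        drift = sum(abs(a - b) for a, b in zip(baseline, outputs)) / len(inputs)
        records.append({
            "removed": i,
            "layers_left": len(pruned),
            "mean_abs_drift": drift,
        })
    return records

# Three toy layers; the middle one is an identity map, so it is "free" to remove.
layers = [make_layer(2.0, 0.0), make_layer(1.0, 0.0), make_layer(1.0, 3.0)]
records = ablation_sweep(layers, [1.0, 2.0])

# The cheapest removal is the one with the smallest drift.
safest = min(records, key=lambda rec: rec["mean_abs_drift"])
```

A separate model trained on many records like these could, in principle, learn to predict the trade-off before an edit is made; whether any existing tool does this is exactly the open question above.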