by BoorishBears 1048 days ago
If someone says they're fine tuning a model (which is changing which layers are activated for a given input) it's generally well tolerated.

If someone says they're tuning a prompt (which is changing which layers are activated for a given input) it's met with extreme skepticism.

At the end of the day ML is probabilistic. You're always throwing random things at a black box and hoping for the best. There are strategies and patterns that work consistently enough (like ReAct) that they carry across many tasks, and there are some that you'll find for your specific task.

And just like any piece of software you define your scope well, test for things within that scope, and monitor for poor outputs.
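To make the "define scope, test within it, monitor" point concrete, here is a minimal sketch of scoring a prompt against a fixed set of in-scope cases. Everything in it is an assumption for illustration: `pass_rate`, `stub_model`, the templates and the single test case are hypothetical, and `stub_model` is a hard-coded stand-in rather than a real model, so the numbers say nothing about how any actual model behaves.

```python
from typing import Callable


def pass_rate(model: Callable[[str], str], template: str,
              cases: list[tuple[str, str]]) -> float:
    """Fraction of in-scope cases whose expected text appears in the output."""
    hits = 0
    for question, expected in cases:
        output = model(template.format(question=question))
        if expected in output:
            hits += 1
    return hits / len(cases)


def stub_model(prompt: str) -> str:
    # Hypothetical stand-in, NOT a real model: it only answers
    # when the prompt contains "show your work".
    return "2 + 2 = 4" if "show your work" in prompt else "not sure"


# The "scope": a fixed list of (input, expected substring) pairs.
cases = [("What is 2 + 2?", "4")]

plain = pass_rate(stub_model, "{question}", cases)
tuned = pass_rate(stub_model, "{question} Please show your work.", cases)
print(plain, tuned)  # prints "0.0 1.0" for this stub
```

The same function could be rerun on logged production inputs to watch for regressions, which is roughly the "monitor" step; swapping `stub_model` for a real client call is left out on purpose.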

2 comments

> If someone says they're fine tuning a model (which is changing which layers are activated for a given input) it's generally well tolerated.

> If someone says they're tuning a prompt (which is changing which layers are activated for a given input) it's met with extreme skepticism.

There are good reasons for that though. The first is the model-owner tuning so that given inputs yield better outputs (in theory for other users too). The second is relying on the user to diagnose and fix the error. Making that the "fix" is a problem if the output is supposed to be useful to people who don't know the answers themselves, or if the model is being touted as "intelligence" with a natural language interface, which is where the scepticism comes in...

I mean, a bugfix, a recommendation not to use the 3rd menu option or a "fork this" button are all valid routes to change the runtime behaviour of a program!

(and yes, I get that the "tuning" might simply be creating the illusion that the model approaches wider usability, and that "fine tuning" might actually have worse side effects. So it's certainly reasonable to argue that when a company defines its models' scope as "advanced reasoning capabilities" the "tuning" might also deserve scepticism, and conversely if it defines its scope more narrowly as something like "code complete" there might be a bit more onus on the user to provide structured, valid inputs)

I'm not sure what this is trying to say.

Neither option implies you own the model or don't: OpenAI owns the model and uses prompt tuning for their website interface, which is why it changes more often than the underlying models themselves. They also let you fine tune their older models, which you don't own.

You also seem to be missing that in this context prompt tuning and fine tuning are both about downstream tasks where the "user" is not you as an individual who's fine tuning and improving prompts, but the people (plural) who are using the now improved outputs.

These aren't the contexts that invite the scepticism though (except when the prompt is revealed after blowing up Sydney-style!)

The "NN provided incorrect answer to simple puzzle; experts defend the proposition the model has excellent high-level reasoning ability by arguing user is 'not good at prompting'" context is, which (amid more legitimate gripes about whether the right model is being used) is what is happening in this thread.

ELI5 layers? Could someone like me see when I've used one layer as opposed to another, when using ChatGPT?

Technically I'm taking a large liberty saying you're "activating layers": all the layers affect the output, and you don't pick and choose them.

But you can imagine the model like a plinko board: just because the ball passes every peg doesn't mean every peg changed its trajectory.

When you fine tune a model, you're trying to change how the pegs are arranged so the ball falls through the board differently.

When you prompt tune you're changing how the ball will fall too. You don't get to change the board, but you can change where the ball starts or have the ball go through the board several more times than normal before the user sees it, etc.

You can't see the ball falling (which layers are doing what), only where it falls, but when you spend long enough building on these models, you do get an intuition for which prompts have an outsized effect on where the ball will land.
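If it helps, the plinko picture can be written as a toy. This is only a sketch of the analogy, not of how a neural network works: the board layout, the `drop` function and the peg values are all made up for illustration. Each peg nudges the ball left (-1), right (+1), or not at all (0). "Fine tuning" edits a peg; "prompt tuning" leaves the board alone and changes where the ball starts.

```python
def drop(board: list[list[int]], start: int) -> int:
    """Drop a ball at column `start` and return the column where it lands."""
    col = start
    width = len(board[0])
    for row in board:
        # The ball passes a peg in every row, but a 0 peg leaves it alone.
        col = min(max(col + row[col], 0), width - 1)
    return col


# Hypothetical board: rows of pegs, one nudge value per column.
board = [
    [+1, +1, -1, -1],
    [+1, 0, 0, -1],
    [0, +1, -1, 0],
]

baseline = drop(board, 0)

# "Prompt tuning": same board, different starting column.
prompted = drop(board, 3)

# "Fine tuning": copy the board and rearrange one peg.
tuned_board = [row[:] for row in board]
tuned_board[2][1] = 0
fine_tuned = drop(tuned_board, 0)

print(baseline, prompted, fine_tuned)  # prints "2 1 1"
```

Both changes move the ball from column 2 to column 1 here, by different means, and from the outside you only ever see the landing column.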