| HN Mirror



	by sandkoan 1109 days ago
	You wouldn't actually want to, because you'd be losing generalizability, and it's a lot of unnecessary work. I think approach #1 outlined above is the better (more cost- and time-efficient) technique—where a pretrained model already understands JSON (among myriad other formats), and you merely constrain it at text-gen time to valid JSON (or other format).
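To make "constrain it at text-gen time" concrete, here is a minimal sketch, not the actual implementation discussed in this thread. It assumes a toy vocabulary and a hypothetical `fake_scores` function standing in for a real model's next-token scores; the state table only covers flat JSON objects. At each step, tokens that would break the JSON are simply excluded before picking the best-scoring one.

```python
import json

# Toy vocabulary (assumption); a real tokenizer has tens of thousands of entries.
VOCAB = ["Sure, here you go:", '"name"', '"Ada"', '"age"', "36", "{", ":", ",", "}"]
STRINGS = {'"name"', '"Ada"', '"age"'}

def fake_scores(emitted):
    """Hypothetical stand-in for a model's next-token scores."""
    # Earlier vocabulary entries score higher; already-emitted tokens are penalized.
    return {tok: (len(VOCAB) - i) - 10 * emitted.count(tok)
            for i, tok in enumerate(VOCAB)}

# For each parser state, the tokens that keep the output a valid prefix
# of a flat JSON object.
ALLOWED = {
    "start": {"{"},
    "key_or_end": STRINGS | {"}"},
    "key": STRINGS,
    "colon": {":"},
    "value": STRINGS | {"36"},
    "comma_or_end": {",", "}"},
}

TRANSITIONS = {
    "start": "key_or_end",
    "key_or_end": "colon",
    "key": "colon",
    "colon": "value",
    "value": "comma_or_end",
    "comma_or_end": "key",
}

def next_state(state, token):
    # "}" is only ever allowed where it legally closes the object.
    return "done" if token == "}" else TRANSITIONS[state]

def generate(max_steps=32):
    emitted, state = [], "start"
    for _ in range(max_steps):
        if state == "done":
            break
        scores = fake_scores(emitted)
        # The constraint: choose the best-scoring token among the allowed ones only.
        token = max(ALLOWED[state], key=scores.__getitem__)
        emitted.append(token)
        state = next_state(state, token)
    return "".join(emitted)

if __name__ == "__main__":
    # Unconstrained, the toy model's top choice is chatty prose, not JSON.
    print(max(VOCAB, key=fake_scores([]).get))
    text = generate()
    print(text)
    print(json.loads(text))  # parses, because invalid tokens were never selectable
```

The model itself is untouched here; only the set of selectable tokens changes per step, which is why no finetuning is involved.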

1 comments

sandGorgon 1108 days ago

So I'm not sure what the difference is between what you wrote and what I wrote. Are you distinguishing between "pretrained models" (as in base models) and finetuned models?

Here's my question then: was the GPT 0613 update (which introduced functions) a completely new base model or simply a finetuned model? It seems to be the latter.


sandkoan 1108 days ago

Yeah, then it seems we agree. I was just pointing out that it's not necessary to finetune OSS models to behave like OpenAI functions if you're able to do something similar to what we did (no tuning involved!).
