by sandkoan 1062 days ago
You wouldn't actually want to, because you'd be losing generalizability, and it's a lot of unnecessary work.

I think approach #1 outlined above is the better (more cost- and time-efficient) technique: a pretrained model already understands JSON (among myriad other formats), and you merely constrain it at text-generation time to emit valid JSON (or whatever other format you need).
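To make "constrain it at text-gen time" concrete, here's a rough toy sketch, not our actual implementation. Everything in it is made up for illustration: the tiny `VOCAB`, the `GRAMMAR` state machine for a flat JSON object, and `fake_logits`, which stands in for a real pretrained model by returning random scores. The only point is the masking step: tokens the grammar doesn't allow in the current state get a score of -inf, so the decoder can never pick them, and no weights are touched.

```python
import json
import math
import random

# Hypothetical toy vocabulary: JSON fragments plus tokens that would break the format.
VOCAB = ['{', '}', ':', ',', '"name"', '"age"', '"Ada"', '42', 'Sure', 'here', '<eos>']

# Hypothetical grammar for a flat JSON object: state -> {allowed token: next state}.
GRAMMAR = {
    "start": {"{": "key"},
    "key": {'"name"': "colon", '"age"': "colon"},
    "colon": {":": "value"},
    "value": {'"Ada"': "sep", "42": "sep"},
    "sep": {",": "key", "}": "end"},
    "end": {"<eos>": "done"},
}


def fake_logits(rng):
    """Stand-in for a pretrained model: one unconstrained score per vocab token."""
    return [rng.gauss(0.0, 1.0) for _ in VOCAB]


def constrained_generate(seed=0, max_pairs=3):
    """Greedy decoding where grammar-violating tokens are masked out at every step."""
    rng = random.Random(seed)
    state, pieces, pairs = "start", [], 0
    while state != "done":
        allowed = dict(GRAMMAR[state])
        if state == "sep" and pairs >= max_pairs:
            allowed.pop(",")  # out of budget: the only legal move is to close the object
        logits = fake_logits(rng)
        # The constraint itself: disallowed tokens get -inf and can never win the argmax.
        masked = [s if t in allowed else -math.inf for t, s in zip(VOCAB, logits)]
        token = VOCAB[masked.index(max(masked))]
        if state == "value":
            pairs += 1
        state = allowed[token]
        if token != "<eos>":
            pieces.append(token)
    return "".join(pieces)


if __name__ == "__main__":
    text = constrained_generate(seed=0)
    print(text, json.loads(text))
```

Even though the "model" here is pure noise and would happily emit `Sure` or `here`, every output parses with `json.loads`. With a real model you'd apply the same kind of mask to its logits, with a grammar derived from your schema.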

1 comments

So I'm not sure what the difference is between what you wrote and what I wrote. Are you distinguishing between "pretrained models" (as in base models) and finetuned models?

Here's my question then: was the GPT 0613 update (which introduced functions) a completely new base model, or simply a finetuned model? It seems to be the latter.

Yeah, then it seems we agree. I was just pointing out that it's not necessary to finetune OSS models to behave like OpenAI functions if you're able to do something similar to what we did (no tuning involved!).