by botro 712 days ago
Thanks for sharing this. It's well written and informative. I noticed you used 'temperature=1' in the GPT test for the example in the post. Is this best practice for a task requiring structured output? Have you tested other temperature settings? My casual understanding was that a temperature of 0 is best for these types of workloads, while higher temperatures are more effective for more 'creative' workloads.

I followed whatever the guidance was for each specific model. Some of the LLM finetuning providers did indeed set the temperature to 0 and I followed that, but others suggested 1. I could probably iterate a bit to see what is best for each model, and I might well do that for whichever model I end up doubling down on in subsequent iterations / finetunes. Thanks for the suggestion!
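For what it's worth, iterating over temperatures could be sketched roughly like this. This is only a sketch under assumptions: `generate` is a hypothetical callable standing in for whatever client each provider offers, and "parses as JSON" is an assumed stand-in for whatever validity check the structured-output task actually uses.

```python
import json

def valid_json_rate(generate, prompt, temperature, n_samples=20):
    """Fraction of n_samples completions that parse as JSON.

    `generate(prompt, temperature)` is a hypothetical callable that
    returns the model's completion as a string.
    """
    ok = 0
    for _ in range(n_samples):
        text = generate(prompt, temperature)
        try:
            json.loads(text)
            ok += 1
        except json.JSONDecodeError:
            pass  # completion was not valid JSON
    return ok / n_samples

def sweep_temperatures(generate, prompt,
                       temperatures=(0.0, 0.3, 0.7, 1.0), n_samples=20):
    """Map each temperature to its valid-JSON rate for the same prompt."""
    return {t: valid_json_rate(generate, prompt, t, n_samples)
            for t in temperatures}
```

Running the same sweep per model would show whether a provider's suggested default is actually the best setting for that model on this task.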
GPT models shouldn't be used at temp 1 unless you only care about creative writing. At that setting they get much worse at factual stuff and code than at lower temperatures. And yes, 3.5 Turbo is less affected by this, which might be the reason why the models performed in reverse for you.
For GPT, I would really urge you to try again with 0. A temperature of 1 kind of starts to force it to fail.
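For anyone wondering why the setting matters, here is a minimal sketch of the textbook temperature-scaled softmax. The logits are made up for illustration, and treating temperature 0 as plain argmax is the usual limiting convention, not necessarily how any particular provider implements it.

```python
import math
import random

def apply_temperature(logits, temperature):
    """Turn raw logits into sampling probabilities at a given temperature."""
    if temperature == 0:
        # Limit case: all probability mass on the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng=random):
    """Draw one token index from the temperature-adjusted distribution."""
    probs = apply_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Made-up logits for three candidate tokens.
logits = [4.0, 2.0, 1.0]
for t in (0, 0.5, 1.0):
    print(t, [round(p, 3) for p in apply_temperature(logits, t)])
```

Lower temperatures concentrate probability on the top-scoring token, so less likely tokens get sampled less often. That is the intuition behind preferring low settings when the output has to follow a strict format.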

I would say this actually invalidates the whole thing.

You never use 1 for stuff like this. 1 is for poetry and creative writing. You need to redo this with temp=0 imo.