by Vetch 823 days ago
> Are you sure? I think "Open"AI uses the chat transcripts to help the next training run?

> Fine-tuning.

The learning that occurs through SGD has been shown to be less flexible and to generalize less well than what happens via context. This is due to the restricted way information flows through transformers, a restriction that is even tighter in autoregressive GPTs than in models with bidirectional encoders.

On top of that, SGD already requires a great many examples per concept, and the impact of any single example rapidly diminishes as the learning rate tapers off toward the end of training. Fine-tuning a fully trained model is far less efficient and more limited than learning from context as a way of introducing new knowledge. It's believed that instruction tuning helps reduce uncertainty in token selection more than it introduces new knowledge.
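To make the learning-rate point concrete, here is a rough sketch of a warmup-plus-cosine-decay schedule. Everything in it is an assumption for illustration: the schedule shape, the function names, and the numbers (peak 3e-4, floor 3e-5, 1,000 warmup steps, 100,000 total steps) are not taken from any particular model. It also assumes plain SGD and equal gradient norms, so the size of an example's update scales directly with the learning rate at the step where it is seen.

```python
import math

def cosine_lr(step, total_steps, peak_lr=3e-4, min_lr=3e-5, warmup_steps=1000):
    """Linear warmup, then cosine decay to min_lr. All defaults are illustrative."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

def relative_update_size(step, total_steps, peak_lr=3e-4, **kw):
    """Plain SGD update is lr * grad, so with equal gradients the update
    size relative to the peak is just lr(step) / peak_lr."""
    return cosine_lr(step, total_steps, peak_lr=peak_lr, **kw) / peak_lr

TOTAL_STEPS = 100_000  # hypothetical run length
for step in (1_000, 50_000, 99_999):
    print(step, round(relative_update_size(step, TOTAL_STEPS), 3))
```

Under these assumptions an example seen near the end of the run moves the weights by roughly a tenth of what the same example would have done at the peak. Adaptive optimizers complicate the picture, but the direction of the effect is the same.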

> Co-pilot gets to watch people figure stuff out

We don't actually know if that's true. It depends on how many intermediate steps Microsoft records as training data. If enough intermediate steps lead to bad results and needed backtracking, but that erasure is not captured, it will significantly harm model quality. It is not nearly as easy to do well as you make it seem.
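A toy sketch of the failure mode, with a hypothetical `EditSession` recorder (this is not how Copilot or any real telemetry pipeline works, just an illustration): if undo events are dropped from the log, an abandoned attempt is indistinguishable from an accepted step.

```python
from dataclasses import dataclass, field

@dataclass
class EditSession:
    """Hypothetical recorder that logs every edit and every undo."""
    text: str = ""
    events: list = field(default_factory=list)    # (kind, text_after) tuples
    _history: list = field(default_factory=list)  # earlier states, for undo

    def edit(self, new_text):
        self._history.append(self.text)
        self.text = new_text
        self.events.append(("edit", new_text))

    def undo(self):
        if not self._history:
            return
        self.text = self._history.pop()
        self.events.append(("undo", self.text))

    def naive_trace(self):
        """Log with undo events dropped: bad attempts look like accepted steps."""
        return [text for kind, text in self.events if kind == "edit"]

    def faithful_trace(self):
        """Only the edits that were never undone."""
        kept = []
        for kind, text in self.events:
            if kind == "edit":
                kept.append(text)
            elif kept:
                kept.pop()  # an undo cancels the most recent surviving edit
        return kept

session = EditSession()
session.edit("def mean(xs):")
session.edit("def mean(xs): return sum(xs) / 2")        # wrong attempt
session.undo()                                          # backtrack
session.edit("def mean(xs): return sum(xs) / len(xs)")  # corrected

print(session.naive_trace())
print(session.faithful_trace())
```

The naive trace contains the wrong attempt as if it were a normal step on the way to the answer; the faithful trace needs the undo events to know it was discarded.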

All in all, getting online learning into models has proven very challenging. While some "infinite" context alternatives to self-attention are promising for long-term memory (LTM), it would remain true that the majority of computational power and knowledge resides in the fixed FF weights. If context and weights conflict, this can cause degradation during inference. You might have encountered this yourself with GPT-4 worsening with search. Lots of research is required to match human learning flexibility and efficiency.

1 comments

> If enough intermediate steps lead to bad results and needed backtracking, but that erasure is not captured

That is a fascinating insight to me. I'm so used to the Emacs undo record that I forget that others are not as lucky. I just take for granted that the entire undo history would be available.