by d4rkp4ttern 1180 days ago
Everyone seems to assume that all the “tricks” behind training ChatGPT are known. But the only public clues are in papers from ClosedAI, like the InstructGPT paper. So we assume there is supervised fine-tuning, then reward modeling, and finally reinforcement learning against that reward model (the RLHF recipe).

But there are most likely other tricks that ClosedAI has not published. These probably took years of R&D to come up with, so others trying to replicate ChatGPT would need to rediscover them on their own.

Also, curiously, the app was released in late 2022 while the knowledge cutoff is 2021. I wondered why that might be, and one hypothesis is that they wanted to keep the training data fixed while they iterated on numerous methods, hyperparameter tuning, etc. Unfortunately, all of this unpublished know-how is a defensive moat for ClosedAI.