by shri_krishna
1071 days ago
Which is why the knowledge cutoff date is important. I'd prefer it frozen to before the release of ChatGPT (GPT-3.5). Anything published after that release date should be considered tainted: imagine the sheer number of articles generated by spammers using ChatGPT.
It's not immediately apparent to people just how much leakage can happen this way. Up to a year ago, I'd probably have given people this story[0] to ponder, but now it's no longer a hypothetical: GPT-3.5 and GPT-4 are clear, practical demonstrations of just how much knowledge is implicitly encoded in what we say or write, and of how this knowledge can be teased out of the input data without any prior context, completely unsupervised, given sufficient time and effort (which in silico translates to "sufficient compute", which we already have).
--
[0] - https://www.lesswrong.com/posts/5wMcKNAwB6X4mp9og/that-alien...