Hacker News
by dzdt 1939 days ago
I got some code related to the VICE emulator. It looked pretty real, referring to concepts that make sense in the context of a C64 emulator, but the results said it was GPT output, not real code. It even had the correct GPL license matching that project. It seems the GPT model has learned quite a bit about the real projects it was fed as input.
1 comment

It has entirely memorized a bunch of common open source licenses, a bunch of contributor names/emails, and so on. However, when I've tried to locate the actual code it's producing in the training data, it's not there.