It runs every comment through that GPT output detector, then colors the HN comment and adds a short note to it based on a threshold, like you see in the screenshot. A score above 0.7 is probably AI, above 0.9 is definitely AI, and anything lower is most likely human. Most comments still appear to be human.
It only becomes reliable after about 50 tokens (one token is around 4 characters), so I mark comments that are too short in gray and make no assessment of them.
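A rough sketch of the labeling rule described above, not the code from the repo. It assumes the detector returns a score between 0 and 1, and that token count is estimated from character count using the 4-characters-per-token rule of thumb. The names `estimate_tokens` and `label_comment` and the label strings are made up for illustration.

```python
# Values taken from the description above; everything else is an assumption.
CHARS_PER_TOKEN = 4    # rough rule of thumb: one token is around 4 characters
MIN_TOKENS = 50        # below this the detector is not treated as reliable
PROBABLY_AI = 0.7
DEFINITELY_AI = 0.9


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return len(text) // CHARS_PER_TOKEN


def label_comment(text: str, score: float) -> str:
    """Map a detector score (assumed to be in 0..1) to a label.

    Comments shorter than MIN_TOKENS are not assessed at all
    (these are the ones shown in gray).
    """
    if estimate_tokens(text) < MIN_TOKENS:
        return "too short"
    if score > DEFINITELY_AI:
        return "definitely AI"
    if score > PROBABLY_AI:
        return "probably AI"
    return "most likely human"
```

With these numbers, a comment needs roughly 200 characters before it gets any label other than "too short", whatever score the detector gives it.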
I see. My question about how it works was more about the method you use to identify content as written by an AI. From what I saw in your repo, you rely on a set of GPT-specific configurations that give a percentage of similarity for the content being analyzed?
As described in the repo, I just feed it to the GPT output detector. I didn't write that tool, but from my understanding they trained a GPT model to recognize itself.
Okay, cool. I had heard of the same kind of AI training around the release of DALL·E 2, where one AI was dedicated to generating the image and another AI checked whether the generated image came from an AI or not.
I've put it on https://github.com/chryzsh/GPTCommentDetector