by throwawayadvsec 1072 days ago
Maybe it's some remnants from RLHF?

A lot of these look like what I'd put in a training dataset, not something I'd pay money to generate. IMO it's unlikely that they're responses meant for other people.

It kinda looks like what GPT-2 used to output, or what small current models output, when they get "lost".

What prompts did you use? Did you use some kind of unusual syntax that could trigger a bug?

1 comments

Hmm, that could be. It happens with different prompts when the number of input tokens is over a certain length.

Fascinating how hard these kinds of issues are to debug.

If it's about the length, it may be due to optimizations for the large context length.