by SequoiaHope 1178 days ago
Those statements don't come from the base model; they come from the steering methods, which are basically a form of hand-tuning applied after the model is mostly trained, and which the paper says are relatively crude and imperfect. It's precisely because these models are so unpredictable that these attempts at steering were made.
1 comment

Reminds me of early Bing, when you could talk to it for much longer periods of time. You could get it into a state where it would give some terrible reply, and then an upper layer would delete that message and be like, "oops, didn't mean to say that."