by myrmidon 313 days ago
Yes, but those hand-crafted rules are just input data. They don't actually constrain the behavior; they are only an attempt to.

Similarly to how verbal instruction works with a child: You can tell it not to touch the hot stove, but the child still might try.

2 comments

> they don't actually constrain the behavior

They do actually constrain the behavior, with varying degrees of success depending on the model, the system prompt, the inference parameters, the current context length and a lot more. Add in the new `developer` role and you have another avenue for constraining the assistant's outputs. Finally, structured outputs can help forbid specific terms too.
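
To illustrate the last point, here is a toy sketch of how a decode-time constraint differs from prompt text: a forbidden token is masked out before sampling, so it cannot be picked no matter how high the model scored it. Everything here is an assumption for illustration (the three-word vocabulary, the scores, and the helper names `mask_forbidden`, `softmax` and `greedy_pick` are hypothetical); real structured-output implementations work on token IDs and grammars or schemas, not word dictionaries.

```python
import math

def mask_forbidden(logits, forbidden):
    """Return a copy of the scores with forbidden tokens set to -inf."""
    return {tok: (-math.inf if tok in forbidden else score)
            for tok, score in logits.items()}

def softmax(logits):
    """Turn scores into probabilities; a -inf score becomes probability 0."""
    exps = {tok: math.exp(score) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def greedy_pick(logits):
    """Pick the highest-scoring token."""
    return max(logits, key=logits.get)

# Hypothetical next-token scores: the model "wants" to say "stove".
logits = {"stove": 2.5, "table": 1.0, "chair": 0.3}
forbidden = {"stove"}

masked = mask_forbidden(logits, forbidden)
unconstrained = greedy_pick(logits)   # "stove"
constrained = greedy_pick(masked)     # "table"
probs = softmax(masked)               # probs["stove"] is exactly 0.0

print(unconstrained, constrained, probs["stove"])
```

A system prompt only shifts the scores, so "stove" stays possible; the mask removes it from the candidate set entirely.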

You can zap them with RL.