1. These models are trained with significant amounts of RL. So I would argue there's not a static "training dataset"; the model's outputs at each stage of the training process feed back into the released model's behavior.

2. It's reasonable to attribute the model's actions to it after it has been trained. Saying that a model's outputs/actions are not its own because they depend on what is in the training set is like saying your actions are not your own because they depend on your genetics and upbringing. When people say "by itself" they mean "without significant direction by the prompter". If the LLM is responding to queries and taking actions on the Internet (and especially because we are not fully capable of robustly training LLMs to exhibit desired behaviors), it matters little that its behavior would hypothetically have been different had it been trained differently.