But the base model, when it's trained on the whole internet, will pick up some extreme biases on topics where one side is large and vocal and the other side is mostly silent. So RLHF is an attempt to correct for the biases of the internet.