For LLMs / RLHF it's a little more difficult, but https://github.com/huggingface/alignment-handbook and the Zephyr project are a good collection of models, datasets, and scripts that are easy to follow.
I would suggest studying the basics of RL first before diving into RLHF for LLMs, which is much harder to learn on a single GPU.