by jacobross
440 days ago
Man, this is awesome. I've been obsessed with this idea since reading up on end-to-end RL in reasoning models and OpenAI's use of it in Deep Research. It seems like the most powerful agents will use some form of RL or advanced learning. I'm not from an ML/DL background, but these ideas are fascinating and I've begun teaching myself some RL. How long did this take to build, and do you have any advice for someone wanting to learn more about RL in this context? Thanks!