HN Mirror


by jacobross 486 days ago

Man, this is awesome. I've been obsessed with this idea since reading up on end-to-end RL used in reasoning models and OpenAI using it with Deep Research.

Seems like the most powerful agents will make use of some form of RL or advanced learning.

I'm not from an ML/DL background but these ideas are fascinating and I've begun teaching myself some RL.

I'm curious how long this took to build. Do you have any advice for someone wanting to learn more about RL in this context?

Thanks!

1 comment

lukasego 486 days ago

The entire production-ready platform took us 3-4 weeks to build, including figuring out RL and GPU infrastructure. If you want to know more about RL, you can check out Hugging Face. You can also hop on Augento https://augento.ai and join our Slack community. We'll answer and discuss any questions together and with others. You'd get $250 worth of free credits you can use to tinker with RL already - it'll teach you some stuff.

jacobross 486 days ago

That’s impressive. About a month ago I had an extremely long chat with Claude discussing an idea very similar to this. Obviously an idea is worth next to nothing compared to what you and the team have created here, but it’s becoming a genuine obsession of mine. Will Brown’s recent talk on RL fueled it even further.

I’ll jump in this weekend.

Part of me wishes I did CS instead of learning SWE. There’s so much to uncover in RL and jumping straight in at the top feels like the wrong strategy to learn effectively.

I love the idea, love the platform. I’ll be keeping a close eye on how you guys go.

If you need a Technical Product Manager, let me know! I’m currently an Artificial Intelligence Lead at a hardware-enabled SaaS company but genuinely believe RL and agents will be the next step towards AGI.