Hacker News
Kat-Dev-32B, Kat-Coder with Scalable Agentic RL (kwaipilot.github.io)
1 point by robert-zaremba 259 days ago | 1 comment
robert-zaremba 259 days ago
KAT-Dev-32B and KAT-Coder are optimized through several training stages: a mid-training stage, a supervised fine-tuning (SFT) and reinforcement fine-tuning (RFT) stage, and a large-scale agentic reinforcement learning (RL) stage.