Hacker News
Training a small model to write better OCaml with RLVR and GRPO (blog.nilenso.com)
2 points by sriharis 37 days ago