Training a small model to write better OCaml with RLVR and GRPO

Training a small model to write better OCaml with RLVR and GRPO (blog.nilenso.com)
2 points by sriharis 37 days ago