Hacker News
user: starzmustdie
created: 2022-03-18
karma: 7

submissions:

0 points | 0 comments
Show HN: #1 On This Day
18 points | 1 comment
A minimal hackable implementation of policy gradients (GRPO, PPO, REINFORCE)
1 point | 0 comments
0 points | 0 comments
Reasoning Gym: Procedural Dataset Generation for Reinforcement Learning
1 point | 0 comments
0 points | 0 comments
Show HN: Word Game Bench – evaluating language models on word puzzles
1 point | 0 comments
Show HN: Answers to Chip Huyen's ML Interview Questions
3 points | 0 comments