Hacker News
user: starzmustdie
created: 2022-03-18
karma: 7
submissions:
0 points | 0 comments
Show HN: #1 On This Day
18 points | 1 comment
A minimal hackable implementation of policy gradients (GRPO, PPO, REINFORCE)
1 point | 0 comments
0 points | 0 comments
Reasoning Gym: Procedural Dataset Generation for Reinforcement Learning
1 point | 0 comments
0 points | 0 comments
Show HN: Word Game Bench – evaluating language models on word puzzles
1 point | 0 comments
Show HN: Answers to Chip Huyen's ML Interview Questions
3 points | 0 comments