User: t55 | HN Mirror

user: t55
created: 2023-08-18
karma: 896

ML researcher

submissions:

Sokoban Speedrun for RL

6 points | 0 comments

Target Policy Optimization

1 point | 0 comments

Show HN: Kilroy – Knowledge base for teams using Claude Code

5 points | 0 comments

Procedural Reasoning Datasets

1 point | 0 comments

In Defence of Gary Marcus

3 points | 0 comments

Reasoning Gym – Procedural RL reasoning datasets

1 point | 0 comments

ChatGPT Agent [video]

3 points | 0 comments

ReasoningGym: Reasoning Environments for RL with Verifiable Rewards

105 points | 28 comments

Show HN: Rehearsal.so, Duolingo for Public Speaking

3 points | 1 comment

End-to-End Vision Tokenizer Tuning

3 points | 0 comments

YC Interview Mock Practice

2 points | 0 comments

D1: Scaling Reasoning in Diffusion LLMs via Reinforcement Learning

4 points | 0 comments

Are LLMs more than autocomplete? AI Debate

1 point | 0 comments

Block Diffusion: Interpolating Autoregressive and Diffusion Language Models

72 points | 16 comments

How to stay in flow while using Cursor or Windsurf

2 points | 0 comments