Hacker News
user: ag8
created: 2019-07-03
karma: 2006

runrl.com

submissions:

Gourmand Syndrome
27 points | 9 comments
guys why does armenian completely break Claude
99 points | 65 comments
Sampling at negative temperature
203 points | 60 comments
Perfectly Replicating Coca Cola [video]
1 point | 1 comment
Po.ta.to
4 points | 2 comments
Scaling pretraining affects RL sample efficiency
1 point | 0 comments
Systematically generating tests that would have caught Anthropic's top‑K bug
2 points | 0 comments
Tinker
4 points | 2 comments
Training Qwen to answer briefly yet intelligently using feedback control
4 points | 0 comments
Launch HN: RunRL (YC X25) – Reinforcement learning as a service
71 points | 22 comments
Generating the Funniest Joke with RL
1 point | 0 comments