Hacker News
by g413n 118 days ago
a relevant note: we fine-tuned by having the human also use arrow keys, which keeps the data in-distribution but makes it slower to collect