by sytelus 2436 days ago
Many caveats, but impressive progress in manipulation, especially sim2real:

- Only 20% of attempts succeed on the hardest configs with 26+ moves

- Solving steps are not generated by RL (but could be[1])

- The cube is modified internally to transmit additional state via Bluetooth

- Highly calibrated, fine-tuned environment plus a MuJoCo-based sim, set up to match simulation to reality as closely as possible

- The OpenAI Five algorithm is reused pretty much as-is

- Cumulative training time = 13 thousand years, same order of magnitude as the 40 thousand years

- 32+64 V100 GPUs per training cycle

[1] https://arxiv.org/abs/1805.07470