	by cybernoodles 527 days ago
	A common practice is to train a transformer policy to control a given robot in simulation. First, you teleoperate the simulated robot with some controller (keyboard, joystick, etc.) to complete the task and build a demonstration dataset. Then you set up the simulator to permute environment variables such as friction, textures, etc. (domain randomization) and run many epochs at faster than real time until a final policy converges. If the right things were randomized and your demonstrations covered enough variation, the policy should generalize well to the actual hardware.
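
	To make the domain randomization step a bit more concrete, here is a rough sketch of sampling a fresh set of environment parameters per training episode. Everything in it is an assumption for illustration: the parameter names (friction, mass_scale, texture_id), the ranges, and the helper names are hypothetical and not tied to any particular simulator. In a real setup you would write the sampled values into your simulator before each reset.

```python
import random
from dataclasses import dataclass

# Hypothetical randomization ranges; real values depend on your robot and task.
FRICTION_RANGE = (0.5, 1.5)
MASS_SCALE_RANGE = (0.8, 1.2)
NUM_TEXTURES = 10


@dataclass(frozen=True)
class EnvParams:
    """One sampled set of simulator parameters (illustrative fields only)."""
    friction: float
    mass_scale: float
    texture_id: int


def sample_env_params(rng: random.Random) -> EnvParams:
    """Draw one random environment configuration from the ranges above."""
    return EnvParams(
        friction=rng.uniform(*FRICTION_RANGE),
        mass_scale=rng.uniform(*MASS_SCALE_RANGE),
        texture_id=rng.randrange(NUM_TEXTURES),
    )


def randomized_episodes(num_episodes: int, seed: int = 0) -> list[EnvParams]:
    """Return one parameter set per episode; seeded so runs are reproducible."""
    rng = random.Random(seed)
    return [sample_env_params(rng) for _ in range(num_episodes)]


if __name__ == "__main__":
    # Each episode would apply its params to the simulator, then roll out the policy.
    for episode, params in enumerate(randomized_episodes(3)):
        print(episode, params)
```

	The point of sampling per episode rather than once is that the policy never sees the same physics or visuals twice, so it can't overfit to one simulator configuration.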