by tehsauce 1117 days ago
This is very cool despite the most important caveat:

“Note that we do not directly compare with prior methods that take Minecraft screen pixels as input and output low-level controls [54–56]. It would not be an apple-to-apple comparison, because we rely on the high-level Mineflayer [53] API to control the agent. Our work’s focus is on pushing the limits of GPT-4 for lifelong embodied agent learning, rather than solving the 3D perception or sensorimotor control problems. VOYAGER is orthogonal and can be combined with gradient-based approaches like VPT [8] as long as the controller provides a code API.”