"In a few rigged demos, it even lies in more serious ways, like hiding evidence that it failed on a task, in order to get better ratings."