It requires the final function to have a numerically reasonable finite-difference gradient, which is somewhat different from what is commonly referred to as "differentiable": e.g. the internals of that function could still use non-differentiable/non-analytic functions, as long as the finite-difference estimate behaves well at the points the optimizer visits.
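To make that distinction concrete, here is a minimal sketch of a central finite-difference gradient applied to a function that uses abs() internally. The names fd_gradient and objective, and the step size h, are my own illustrative choices and not anything taken from the library being discussed.

```python
def fd_gradient(f, x, h=1e-6):
    """Central finite-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad


def objective(x):
    # abs() has a kink at x[0] == 1, so this function is not
    # differentiable everywhere in the textbook sense.
    return abs(x[0] - 1.0) + (x[1] + 2.0) ** 2


# Away from the kink the estimate is well behaved: roughly [1.0, 4.0].
grad = fd_gradient(objective, [3.0, 0.0])

# Exactly at the kink the estimate is still a finite number, even though
# no true derivative exists there.
kink = fd_gradient(objective, [1.0, 0.0])
```

An optimizer that only ever sees such estimates does not care how the function is built, only that the numbers it gets back point in a usable direction.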
It seems to be built on numeric.js, whose optimizer is in turn based on the classic Fortran UNCMIN [1].
It feels to me like there's also quite a bit of temporal coherence that could be leveraged in order to accelerate the process.
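One way to read "temporal coherence" here (my assumption, not something the project claims to do) is warm starting: seed each solve with the previous frame's solution instead of starting from scratch. A toy sketch, using a hypothetical 1D quadratic whose minimum drifts slowly from frame to frame and a plain gradient descent loop:

```python
def minimize(grad, x0, step=0.25, tol=1e-8, max_iter=10_000):
    """Plain gradient descent; returns (solution, iterations used)."""
    x = x0
    for iteration in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            return x, iteration
        x -= step * g
    return x, max_iter


# Hypothetical per-frame targets that move only a little between frames.
targets = [10.0 + 0.1 * k for k in range(10)]

cold_total = 0
warm_total = 0
previous = 0.0  # the warm start has nothing to reuse on the first frame

for target in targets:
    # Gradient of (x - target)^2, with target bound per frame.
    grad = lambda x, t=target: 2.0 * (x - t)

    _, cold_iters = minimize(grad, 0.0)            # always restart from 0
    previous, warm_iters = minimize(grad, previous)  # reuse last solution

    cold_total += cold_iters
    warm_total += warm_iters
```

In this toy setup warm_total comes out smaller than cold_total, because every frame after the first starts close to its answer. How much a real constraint solver would gain depends on how far the solution actually moves between updates.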