Hacker News new | ask | show | jobs
by aifer4 1046 days ago
You're right, that iterative methods may be linear in time with respect to the number of parameters. In this paper, we provide a method which is linear in time with respect to the dimension (d) of the vector (x) we are solving for in the equation A x = b. The number of parameters is d^2 + d, as there are d^2 parameters for the matrix A and d for the vector b. Gradient descent would require a matrix-vector multiplication just to compute the gradient, which is already O(d^2).

You make an important point, regarding the compilation. In this case, we are talking about the time it takes to upload the matrix A and vector b to the hardware. This requires O(d^2) numbers to be updated, but assuming it is done in parallel it could be done in O(d) time, and the coefficient of this scaling is independent of the physical parameters of the analog hardware. For this reason, in the analysis of the algorithm, we are generally ignoring the time to update the parameters, as is clarified in the Methods section.

1 comments

OK, got it. I think what you're describing is Gradient Descent on the Normal Equations to solve an overdetermined linear system. Indeed in such a system dim(x) == dim(b) == d. Matrix A is fixed though and not part of the estimation but you're correct about the complexity of gradient computation which is indeed O(d^2).

Thanks for the clarification of the uploading / compilation step.