As explained in the Wikipedia article, we use the Newton–Raphson method method with O(log n) steps to find the reciprocal of the divisor, then we multiply this reciprocal by the dividend. Each step of the reciprocal method reqires a single multiplication, with the precision doubling at each step. So the steps take M(1), M(2), M(4), ..., M(n/2), M(n) time. Since multiplication isn't sublinear, M(cn) ≥ c·M(n) for c ≥ 1 and "most" n (very roughly); thus, M(n) + M(n/2) + M(n/4) + ... ≤ M(n) + 1/2·M(n) + 1/4·M(n) + ... ≤ 2·M(n). Adding the final multiplication, we get an upper bound of 3·M(n), or O(M(n)).
See the book referenced. Basically, if you implement division by Newton-Raphson, using multiplication as a subroutine, the complexity is O(M(n)).
In the other direction, one can use division to do multiplication by: (1) doing reciprocal with division, (2) doing squaring with reciprocal, and (3) doing multiplication with squaring. All these operations (division, multiplication, reciprocal, and squaring) turn out to be equivalent in complexity up to a constant factor. All this is asymptotically in n, the number of bits.