You're right, and the technical distinction is warranted, but the definition you give is still not equivalent to mathematical idempotence, since the domain and codomain of f are not the same set.
That's only because the domain is multi-dimensional, while the range is the domain of just the first argument.
We can fold the input into S. That is to say, the input is just one aspect of the state of the world, so we can reduce f(S, i) to just f(S), and then we have f(f(S)) = f(S).
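To make the folding concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the state is assumed to be a set of strings, the input a single string, and f is assumed to be "add the item to the set". The names `World` and `g` are hypothetical; `g` is just f rewritten to take and return the combined state, so its domain and codomain match.

```python
from typing import FrozenSet, Tuple

State = FrozenSet[str]
# Hypothetical "world": the state S with the input i folded in.
World = Tuple[State, str]


def f(state: State, item: str) -> State:
    """Two-argument form: maps (S, i) to a new S, so domain != codomain."""
    return state | {item}


def g(world: World) -> World:
    """One-argument form: the input travels along inside the world,
    so g maps World -> World and g(g(W)) is well defined."""
    state, item = world
    return (f(state, item), item)


world: World = (frozenset({"a"}), "b")
once = g(world)
twice = g(once)
print(once == twice)  # True for this particular f
```

Note that this only shows the types line up after folding; whether g(g(W)) = g(W) actually holds still depends on what f does. An f that appended to a list instead of adding to a set would fail the same check.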