The arsinh transformation is useful (and this is not mentioned in the link you posted) because it is the optimal variance-stabilizing transformation [1] when your data is contaminated by a mixture of additive and multiplicative noise, in the same way that the log transformation is the optimal variance-stabilizing transformation when your data is contaminated only by multiplicative noise.
Read the Wikipedia article for a more formal explanation.
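To make the variance-stabilizing claim concrete, here is a minimal numerical sketch. It assumes a noise model y = mu * (1 + sigma_mult * e1) + sigma_add * e2 with independent standard normal e1 and e2; that model, the helper name arsinh_vst and all parameter values are my own choices for illustration, not something taken from the linked article.

```python
import numpy as np

def arsinh_vst(y, sigma_add, sigma_mult):
    # Hypothetical helper: for Var(y) = sigma_add^2 + (sigma_mult * mu)^2,
    # integrating 1 / sqrt(Var) over mu gives this arsinh form.
    return np.arcsinh(sigma_mult * y / sigma_add) / sigma_mult

rng = np.random.default_rng(0)
sigma_add, sigma_mult = 1.0, 0.1               # assumed noise levels
levels = np.array([0.0, 10.0, 100.0, 1000.0])  # assumed true signal levels
n = 200_000

raw_std, stab_std = [], []
for mu in levels:
    # Multiplicative noise scales with the signal, additive noise does not.
    y = mu * (1 + sigma_mult * rng.standard_normal(n)) \
        + sigma_add * rng.standard_normal(n)
    raw_std.append(y.std())
    stab_std.append(arsinh_vst(y, sigma_add, sigma_mult).std())

raw_std = np.array(raw_std)
stab_std = np.array(stab_std)
print("raw std:       ", raw_std)   # grows with the signal level
print("stabilized std:", stab_std)  # stays roughly constant
```

Under these assumptions the raw standard deviation grows by about two orders of magnitude across the levels, while the transformed standard deviation stays close to 1. A plain log would not work here, since it is undefined at zero and for negative readings.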
Is taking logs (or arcsinh or whatever) really all that good an idea if (a) you don't have a good physical model justifying it or (b) your data spans several orders of magnitude?
It can make nonlinear relationships linear. It also makes the model less sensitive to individual points. For instance, if the data spans several orders of magnitude, adding or removing a single data point in one of those orders can skew the fit a lot on the raw scale; after log-linearization the effect is much smaller.
It's easy to map the log-transformed values back to the original scale by exponentiating afterwards.
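A quick sketch of that round trip, using made-up values: exp undoes log, and sinh undoes arcsinh in the same way, with the difference that arcsinh also accepts zeros and negatives.

```python
import numpy as np

# Illustrative positive data spanning several orders of magnitude.
x = np.array([0.5, 3.0, 40.0, 2_000.0, 150_000.0])
x_back = np.exp(np.log(x))         # log, then exponentiate to get back

# Same idea for arcsinh, which is also defined at 0 and below.
z = np.array([-100.0, 0.0, 0.5, 2_000.0])
z_back = np.sinh(np.arcsinh(z))    # arcsinh, then sinh to get back

print(np.allclose(x, x_back), np.allclose(z, z_back))
```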
As far as I understand, directly transforming your data can lead to problems. In any case, it's something that link functions in generalized linear models handle better [1].
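One such problem, sketched under assumed log-normal data (the parameters below are made up): averaging on the log scale and exponentiating gives you the geometric mean, not the arithmetic mean. A GLM with a log link models log(E[y]) rather than E[log y], so it does not have this mismatch.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, s = 2.0, 1.0                        # assumed log-scale mean and std
y = np.exp(rng.normal(mu, s, 200_000))  # log-normal sample

arith_mean = y.mean()                        # estimates exp(mu + s**2 / 2)
back_transformed = np.exp(np.log(y).mean())  # estimates exp(mu)

print("mean on the original scale: ", arith_mean)
print("exp of the mean of the logs:", back_transformed)
```

The second number comes out systematically lower than the first, so a model fitted to log(y) and exponentiated afterwards underestimates the mean of y unless you correct for it.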
[1] https://en.m.wikipedia.org/wiki/Variance-stabilizing_transfo...