It's bread-and-butter math for physics, engineering (traditional engineering), geophysics, signal processing, etc.
Why would anyone have people implementing Kalman filters who found the math behind them "esoteric"?
Back in the day, in my wet-behind-the-ears phase, the first time I implemented a Kalman filter from scratch, the application was magnetic heading normalisation of mag data from an airborne geophysical survey - 3-axis nanotesla sensor inputs on each wing and the tail boom, requiring a per-survey calibration pattern to normalise the readings over a fixed location regardless of heading.
This was buried as part of a suite requiring calculation of the geomagnetic reference field (a big parameterised spherical harmonic equation), upward, downward and reduce-to-pole continuations of magnetic field equations, raw GPS post-processing corrections, etc.
where "etc" goes on for a shelf full of books with a dense chunk of applied mathematics.
FWIW, I think I understand Kalman filters quite well, but the linked PDF is hard for me to follow, and I'd really struggle to understand it if I didn't already know what it's saying.
I think the lesson there is that the Kalman filter is simpler in the "information form" where the Gaussian distribution is parameterized using the inverse of the covariance matrix.
If you don't already know what that means, you likely won't get much out of that. I think the more intuitive way is to first understand the 1D case, where the filter result is a weighted average of the prediction and the observation, and the weights are the multiplicative inverses of the respective variances (the less uncertainty/"imprecision", the more weight you give).
In the multidimensional case the inverse is the matrix inverse but the logic is the same.
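To make the 1D case concrete, here's a minimal sketch of that weighted average. The function name `fuse` and the numbers are made up for illustration; the only assumption is that both the prediction and the observation are Gaussian estimates of the same scalar.

```python
def fuse(pred_mean, pred_var, obs_mean, obs_var):
    """Combine two Gaussian estimates of the same scalar quantity."""
    # Weights are the inverse variances (the "information" in each estimate).
    w_pred = 1.0 / pred_var
    w_obs = 1.0 / obs_var
    # Information adds up, so the fused variance is smaller than either input.
    fused_var = 1.0 / (w_pred + w_obs)
    # Weighted average of the two means, normalised by the total weight.
    fused_mean = fused_var * (w_pred * pred_mean + w_obs * obs_mean)
    return fused_mean, fused_var


# Hypothetical numbers: a vague prediction and a sharper observation.
mean, var = fuse(pred_mean=10.0, pred_var=4.0, obs_mean=12.0, obs_var=1.0)

# The same result in the usual "gain" form of the Kalman update.
gain = 4.0 / (4.0 + 1.0)
gain_form_mean = 10.0 + gain * (12.0 - 10.0)
```

The fused mean lands closer to the observation because the observation has the smaller variance, and the two forms (inverse-variance weights vs. gain) are just algebraic rearrangements of each other.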
More generally the idea is to statistically predict the next step from the previous and then balance out the prediction and the noisy observation based on the confidence you have in each. This intuition covers all Bayesian filters. The Kalman filter is a special case of the Bayesian filter where the prediction is linear and all uncertainties are Gaussian, although it was understood this way only well after Kalman invented the eponymous filter.
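As a rough sketch of that predict-then-balance loop, here's a scalar Kalman filter. The model (state multiplied by `a` each step, process noise variance `q`, observation noise variance `r`) and all the numbers are assumptions picked for illustration, not anything from the linked PDF.

```python
def kalman_step(mean, var, z, a=1.0, q=0.01, r=1.0):
    """One predict + update cycle of a scalar Kalman filter.

    Assumed model: x_k = a * x_{k-1} + noise(variance q), z_k = x_k + noise(variance r).
    """
    # Predict: push the previous estimate through the linear model.
    pred_mean = a * mean
    pred_var = a * a * var + q
    # Update: balance prediction and observation by their uncertainties.
    gain = pred_var / (pred_var + r)
    new_mean = pred_mean + gain * (z - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


# Start with a very uncertain guess and feed in noisy readings near 5.
mean, var = 0.0, 100.0
for z in [5.1, 4.9, 5.2, 5.0, 4.8]:
    mean, var = kalman_step(mean, var, z)
```

With a large initial variance the first observation is trusted almost completely; as the variance shrinks, the gain drops and each new reading moves the estimate less.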
Not sure how intuitive that is either, but don't be too worried if these things aren't obvious, because they aren't until you know all the previous steps. To implement or use a Kalman filter you don't really need this statistical understanding.
If you prefer to understand things more "procedurally", check out the particle filter. It's conceptually the Bayesian filter but doesn't require the mathematical analysis. That's the way I really understood the underlying logic.
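For the procedural view, here's a minimal bootstrap particle filter sketch using only the standard library. The random-walk motion model, the Gaussian observation noise, and the names `q_std` and `r_std` are all assumptions for illustration; a real filter would plug in its own model and likelihood.

```python
import math
import random


def particle_filter_step(particles, z, q_std=0.1, r_std=1.0, rng=random):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: move every particle through the (assumed) random-walk model.
    moved = [p + rng.gauss(0.0, q_std) for p in particles]
    # Weight: how plausible is observation z if this particle were the true state?
    weights = [math.exp(-0.5 * ((z - p) / r_std) ** 2) for p in moved]
    # Resample: plausible particles get duplicated, implausible ones die out.
    return rng.choices(moved, weights=weights, k=len(moved))


rng = random.Random(0)  # seeded so the run is repeatable
# Start from a very vague belief: particles spread uniformly over a wide range.
particles = [rng.uniform(-20.0, 20.0) for _ in range(2000)]
for z in [5.1, 4.9, 5.2, 5.0, 4.8]:
    particles = particle_filter_step(particles, z, rng=rng)

# The particle cloud approximates the posterior; its mean is the estimate.
estimate = sum(particles) / len(particles)
```

There's no gain and no covariance algebra here: the cloud of particles is the distribution, and the predict and update steps are just "move the samples" and "keep the ones that agree with the data".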
I understood it as re-estimation with a dynamic weight factor based on the perceived error. I know it’s more complex than that, but this simplified version was what I needed at one point, and it worked.