For the math, check out this video: https://www.youtube.com/watch?v=HoKDTa5jHvg
A lot of other sources skip over the math to give you the big picture.
Once you have that big picture, you will find the video very helpful in understanding the math.
The meta tl;dr is that diffusion is what modern AI image generation generally uses (with a lot of developments in that space), but the existing code workflows/Colab notebooks are a mess and filled with technical debt, so a high-level Hugging Face approach for the tool makes sense.