Looks like it was inspired by http://keyj.emphy.de/minimp3/ and https://keyj.emphy.de/kjmp2/ , always good to see more people trying to implement multimedia codecs. The theory behind it all is not easy to understand, but in practice it turns into a bunch of arithmetic and table lookups.
That said, this one being floating-point and requiring intrinsics makes it less portable.
You're absolutely correct, it is indeed inspired by KeyJ's minimp3. Floating point is used because it's hard to achieve ISO conformance with a 4-byte fixed-point type. We would have to use different dynamic ranges for different parts of the decoder (i.e. emulate our own floating point) or use a fixed-point type wider than 4 bytes. Only floating-point support is needed; SSE/NEON intrinsics are not required and can be fully disabled with MINIMP3_NO_SIMD.
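To illustrate the dynamic-range point, here is a minimal sketch of a 4-byte fixed-point multiply. The Q1.31 format and the helper names (`q31_from_double`, `q31_mul`) are assumptions made for this example only and are not taken from minimp3. It only shows that a single fixed scaling loses small products that a 32-bit float still represents.

```c
#include <stdint.h>

/* Hypothetical Q1.31 format: value = raw / 2^31, usable range [-1, 1). */
static int32_t q31_from_double(double x)
{
    /* Round to nearest; the caller must keep x inside [-1, 1). */
    return (int32_t)(x * 2147483648.0 + (x >= 0 ? 0.5 : -0.5));
}

static double q31_to_double(int32_t q)
{
    return (double)q / 2147483648.0;
}

/* Multiply two Q1.31 values. The 64-bit intermediate holds the full
   product; shifting right by 31 brings it back to Q1.31 and drops
   everything below 2^-31 (about 4.7e-10). */
static int32_t q31_mul(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 31);
}

/* The same multiplication in float, for comparison. */
static float float_mul(float a, float b)
{
    return a * b;
}
```

With these helpers, 0.5 * 0.5 comes out exactly as 0.25 in Q1.31, but 1e-6 * 1e-6 collapses to zero because the product is below the format's resolution, while the float version still returns a nonzero value. Avoiding that in fixed point means rescaling per decoder stage, which is the "emulate our own floating point" option.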
It looks like it's fast because it makes use of x86 and ARM SIMD extensions, but you could probably get the scalar version to run on an Arduino with some effort. It would likely be slow, though.
If you are looking at Arduino audio applications, I would suggest either using an MP3 codec ASIC or pre-decoding to WAV and compensating with larger storage.
Encoders are usually harder because, for example, you can't verify them against reference vectors (there is no exact reference output to compare with). Also, encoders like H.264 contain a large part of the decoder as well, because they must reconstruct each encoded frame internally for motion compensation.
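The "encoder contains a decoder" idea can be sketched with something much simpler than H.264. The example below uses plain DPCM on integer samples as a stand-in for motion-compensated prediction; the step size `STEP` and all function names are hypothetical. The point is only that the encoder predicts from its own reconstructed output, not from the original input, so it stays in sync with the decoder.

```c
#include <stddef.h>

#define STEP 8 /* hypothetical quantizer step size */

/* Quantize a prediction residual to the nearest multiple of STEP. */
static int quantize(int residual)
{
    return (residual >= 0 ? residual + STEP / 2 : residual - STEP / 2) / STEP;
}

/* The reconstruction step shared by encoder and decoder. */
static int reconstruct(int prediction, int code)
{
    return prediction + code * STEP;
}

/* Encoder: predicts each sample from the previously RECONSTRUCTED
   sample, which is exactly what the decoder will have available. */
static void dpcm_encode(const int *in, int *codes, int *recon, size_t n)
{
    int prev = 0;
    for (size_t i = 0; i < n; i++) {
        codes[i] = quantize(in[i] - prev);
        prev = reconstruct(prev, codes[i]); /* embedded decoder step */
        recon[i] = prev;
    }
}

/* Decoder: runs the same reconstruction on the received codes. */
static void dpcm_decode(const int *codes, int *out, size_t n)
{
    int prev = 0;
    for (size_t i = 0; i < n; i++) {
        prev = reconstruct(prev, codes[i]);
        out[i] = prev;
    }
}
```

Because both sides call the same `reconstruct`, the decoder output matches the encoder's internal reconstruction exactly and the quantization error does not accumulate. If the encoder predicted from the original samples instead, the two would drift apart.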