Most published libm on GPU/NPU side have a few ULP of error for the perf vs accuracy tradeoff. Eg, documented explicitly in the CUDA programming guide: https://docs.nvidia.com/cuda/cuda-programming-guide/05-appen... .
Prof. Zimmermann and collaborators have a great table at https://members.loria.fr/PZimmermann/papers/accuracy.pdf (Feb 2026) comparing various libm wrt accuracy.