Poly1305 is straightforward to implement and, like other polynomial-based MACs, scales with the width of SIMD vectors. BLAKE3 is nice but tricky to implement and optimize; a textbook implementation performs very poorly.
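To give a sense of how small the core of Poly1305 is, here is a minimal sketch of the one-shot MAC as described in RFC 8439, using Python big integers. The function name `poly1305_mac` is my own, and this is only an illustration: it is not constant-time, does no SIMD, and should not be used for benchmarks or production.

```python
# Minimal Poly1305 sketch following RFC 8439, section 2.5.
# Illustrative only: relies on Python big integers, not constant-time.

P = (1 << 130) - 5                              # the prime 2^130 - 5
CLAMP = 0x0ffffffc0ffffffc0ffffffc0fffffff      # clamping mask for r


def poly1305_mac(msg: bytes, key: bytes) -> bytes:
    """Return the 16-byte Poly1305 tag of msg under a 32-byte one-time key."""
    if len(key) != 32:
        raise ValueError("key must be 32 bytes")
    r = int.from_bytes(key[:16], "little") & CLAMP
    s = int.from_bytes(key[16:], "little")
    acc = 0
    for i in range(0, len(msg), 16):
        # Each block gets a 0x01 byte appended before little-endian decoding.
        n = int.from_bytes(msg[i:i + 16] + b"\x01", "little")
        acc = (acc + n) * r % P                 # one Horner step
    # Add s and keep the low 128 bits.
    return ((acc + s) & ((1 << 128) - 1)).to_bytes(16, "little")
```

The loop is just Horner evaluation of a polynomial in r modulo 2^130 - 5. Vectorized implementations typically split the message across several lanes and multiply each lane by a precomputed power of r, which is why throughput grows with vector width.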
Avoid using Rust implementations for serious benchmarks, especially in the context of symmetric cryptography.