by jfk13 1883 days ago
I'd be curious how this compares with the encoding_rs crate (https://github.com/hsivonen/encoding_rs), which also offers UTF-8 validation (among many other things).

This is SIMD-based. If the encoding_rs crate isn't SIMD-based, then this will beat the pants off it.
It does use SIMD (when enabled). But also, the encoding_rs crate is doing more. It isn't just checking for validity. It's also doing transcoding and error detection.
As far as I can see, it does not use SIMD for UTF-8 validation, only likely/unlikely intrinsics; see https://github.com/hsivonen/encoding_rs/blob/e98a2096ab09c92...
That's not the only use of SIMD in the crate (e.g. see https://github.com/hsivonen/encoding_rs/blob/e98a2096ab09c92...), but I haven't looked into exactly where/how it's used further.
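To make the distinction drawn above concrete (checking validity versus decoding with error handling), here is a minimal sketch using only the Rust standard library. It does not use encoding_rs or any SIMD validator, and the function names `validate` and `decode_lossy` are hypothetical, chosen for illustration. The point is that a validator only has to answer "is this well-formed, and if not, where does it break?", while a decoder also has to produce output for malformed input.

```rust
use std::borrow::Cow;

/// Validation only: Ok(()) if `bytes` is well-formed UTF-8, otherwise the
/// number of leading bytes that were valid. (Hypothetical helper name.)
fn validate(bytes: &[u8]) -> Result<(), usize> {
    match std::str::from_utf8(bytes) {
        Ok(_) => Ok(()),
        Err(e) => Err(e.valid_up_to()),
    }
}

/// Decoding with error handling: always yields text, substituting U+FFFD
/// for malformed sequences. Borrows when the input is already valid and
/// allocates only when a replacement is needed. (Hypothetical helper name.)
fn decode_lossy(bytes: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(bytes)
}
```

For the input `b"ab\xFFcd"`, `validate` returns `Err(2)` and does no further work, whereas `decode_lossy` has to build a new string with the bad byte replaced. That extra work is one reason a validation-only benchmark is not a like-for-like comparison with a decoder.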