by masklinn
2244 days ago
> Indeed. This is a pretty damning difference. The `target` string is being repeatedly UTF-8 decoded, whereas the same is not true in the Go version. The Go version even goes out of its way to do UTF-8 decoding exactly once for each of `source` and `target`, but then doesn't do the same for the Rust program.

I'm really not sure that's an issue: UTF-8 decoding is very, very cheap and it's iterating either way. It would have to be benchmarked, but I wouldn't be surprised if allocating the caches (at least one allocation per line of input) had way more overhead, especially given the inputs are so very short.

I'm not going to claim Rust's UTF-8 decoder is the fastest around, but it's very fast.
But yes, I did benchmark this, even after reusing allocations, and I can't tell a difference. The benchmark is fairly noisy.
I agree with your conclusion, especially after looking at the input[1]. The strings are so small that the overhead of caching the UTF-8 decoding is probably comparable to the cost of doing UTF-8 decoding.
[1] - https://github.com/christianscott/levenshtein-distance-bench...
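
For anyone who wants to poke at this themselves, here is a rough sketch of the two variants being compared. This is not the code from the benchmark repo; the function names (`levenshtein_core`, `levenshtein_redecode`, `levenshtein_cached`) are made up for illustration, and the single-row Levenshtein loop is just one common way to write it. The only difference between the two entry points is whether `target` is re-decoded from UTF-8 on every outer iteration or decoded once into a `Vec<char>`, which costs an extra allocation per call.

```rust
/// Shared single-row Levenshtein loop (hypothetical helper, not the
/// benchmark's code). `target` is cloned and walked once per `source` char.
fn levenshtein_core<I>(source: &str, target: I, target_len: usize) -> usize
where
    I: Iterator<Item = char> + Clone,
{
    // cache[j] holds the previous row's distance for the target prefix of length j + 1.
    let mut cache: Vec<usize> = (1..=target_len).collect();
    // If `source` is empty, the distance is simply the target length.
    let mut result = target_len;
    for (i, s) in source.chars().enumerate() {
        // Distance from the source prefix of length i + 1 to the empty target prefix.
        result = i + 1;
        // Previous row's value one column to the left (the diagonal cell).
        let mut diag = i;
        for (j, t) in target.clone().enumerate() {
            let substitute = if s == t { diag } else { diag + 1 };
            diag = cache[j];
            // min(substitution, deletion, insertion)
            result = substitute.min(diag + 1).min(result + 1);
            cache[j] = result;
        }
    }
    result
}

/// Variant A: every pass over `target` decodes its UTF-8 bytes again.
fn levenshtein_redecode(source: &str, target: &str) -> usize {
    levenshtein_core(source, target.chars(), target.chars().count())
}

/// Variant B: decode `target` once up front, at the cost of one more allocation.
fn levenshtein_cached(source: &str, target: &str) -> usize {
    let decoded: Vec<char> = target.chars().collect();
    levenshtein_core(source, decoded.iter().copied(), decoded.len())
}
```

Both variants compute the same distances, so any difference is purely decode cost versus allocation cost. Which one wins on short inputs would have to be measured.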