by brabel
1708 days ago
> it’s using Unicode-aware string stuff

Rust uses UTF-8 internally for Strings, so it's very efficient to parse a file into a String and then use slices to go through it. This is probably the best you can get, as parsing ASCII input as UTF-8 is very efficient: the high bit is always zero in ASCII, and the Unicode decoder only needs to check that's the case for every byte, so it's not doing some kind of complicated computation to decode. If you use bytes for everything, you will make the whole code much harder to follow and it still won't run faster. Check for yourself: https://github.com/renatoathaydes/prechelt-phone-number-enco...
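A minimal sketch of the check described here, assuming pure ASCII input. The helper name all_ascii and the sample bytes are made up for illustration; the standard library's real validation is more optimised than this loop, but the idea is the same.

```rust
/// Returns true when every byte has its high bit clear, i.e. the input is pure ASCII.
/// (Hypothetical helper; `<[u8]>::is_ascii` in std does the same job.)
fn all_ascii(bytes: &[u8]) -> bool {
    bytes.iter().all(|b| b & 0x80 == 0)
}

fn main() {
    // Made-up sample input: digits, a dash and newlines.
    let input = b"5624-82\n4824\n";
    assert!(all_ascii(input));
    assert_eq!(all_ascii(input), input.is_ascii());

    // ASCII is valid UTF-8, so validation succeeds and the &str borrows the same bytes.
    let text = std::str::from_utf8(input).expect("ASCII is valid UTF-8");
    assert_eq!(text.len(), input.len());

    // Bytes with the high bit set are not ASCII.
    assert!(!all_ascii(&[0xC3, 0xA9]));
}
```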
The code will be easier to follow if you use bytes throughout. Currently it’s a mixture of bytewise and charwise operations, so you need to think a little about whether you’re dealing with char or u8 in each place (and half of them are even mislabelled). There are suitable ASCII alternatives for every place that uses charwise methods (e.g. char.is_digit(10) → u8.is_ascii_digit()), so no extra burden is added. In the end the only place slightly complicated by it is printing the solutions, but more code will have been simplified, hotter-path code too, so it’s easily worth it.
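To illustrate the char.is_digit(10) → u8.is_ascii_digit() swap, here is a rough sketch. The function names and the sample phone number are invented for this comment and are not taken from the repository.

```rust
/// Char-wise: decodes the &str into chars and uses the radix-taking method.
fn digits_charwise(word: &str) -> Vec<u8> {
    word.chars()
        .filter(|c| c.is_digit(10))
        // is_digit(10) only accepts '0'..='9', so the cast to u8 is lossless here.
        .map(|c| c as u8 - b'0')
        .collect()
}

/// Byte-wise: works on the raw bytes with the ASCII-specific method.
fn digits_bytewise(word: &[u8]) -> Vec<u8> {
    word.iter()
        .filter(|b| b.is_ascii_digit())
        .map(|b| b - b'0')
        .collect()
}

fn main() {
    // Made-up phone number for illustration.
    let number = "5624-82";
    assert_eq!(digits_charwise(number), digits_bytewise(number.as_bytes()));
    assert_eq!(digits_bytewise(b"5624-82"), vec![5, 6, 2, 4, 8, 2]);
}
```

Both versions give the same result on ASCII input; the bytewise one just never leaves u8, so there is only one type to keep in your head.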
I don’t know where the code you’re citing came from. It’s newer than what’s on the master branch, but its changes include some pretty bad stuff like DIGITS, which uses &str for something that is always a single ASCII digit and is looked up from a digit you already hold as a u8, turning it back into a string prematurely. Admittedly the optimiser is going to fix a fair bit of that badness, but not all.
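Roughly what I mean, as a sketch. This DIGITS table is my guess at the shape of the pattern, not a copy of the repository's code, and digit_as_str and render are hypothetical names.

```rust
// Hypothetical reconstruction of the pattern; not copied from the repository.
const DIGITS: [&str; 10] = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"];

/// Early conversion: the ASCII digit byte is turned into a &str as soon as it is seen.
fn digit_as_str(digit: u8) -> &'static str {
    DIGITS[usize::from(digit - b'0')]
}

/// Late conversion: the pieces stay as bytes and only become text when printed.
/// Assumes the input is ASCII, as in the rest of this thread.
fn render(parts: &[u8]) -> String {
    parts.iter().map(|&b| char::from(b)).collect()
}

fn main() {
    let digit = b'7';
    assert_eq!(digit_as_str(digit), "7");

    // Keeping the digit as a u8 until output time avoids the table lookup entirely.
    let solution = [b'4', b' ', digit];
    assert_eq!(render(&solution), "4 7");
}
```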