by Araq 4678 days ago
Strings should be converted to UTF-8 as part of input validation. For proper input validation we already have the taint mode. Note that a file often carries no information about its encoding, so you can only guess it with heuristics.
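
To make the "convert at the boundary, guess when you must" idea concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: the function name `to_utf8`, the BOM table, and the Latin-1 fallback are all hypothetical choices of mine, not anything a particular language or library prescribes, and it does not model taint mode at all.

```python
import codecs

# Byte-order marks to check first. UTF-32 must come before UTF-16,
# because the UTF-32-LE BOM begins with the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def to_utf8(raw: bytes, fallback: str = "latin-1") -> bytes:
    """Return `raw` re-encoded as UTF-8, guessing the source encoding.

    Heuristic (an assumption, not a standard):
      1. If a BOM is present, trust it and strip it.
      2. Otherwise, if the bytes are valid UTF-8, keep them unchanged.
      3. Otherwise, decode with `fallback` (Latin-1 accepts any byte).
    """
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return raw[len(bom):].decode(name).encode("utf-8")
    try:
        raw.decode("utf-8")  # validation only; result is discarded
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")
```

The fallback step is exactly where the guessing happens: Latin-1 never fails to decode, so a wrong guess yields valid UTF-8 containing the wrong characters rather than an error.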