by sheetjs 2100 days ago
CSV is a very bad example. Yes, it is easy to throw together a simple regex to parse simple RFC4180 CSV strings, but Excel is its own black box with a huge number of hacks.

For example, en-US Excel will automagically parse TRUE and "TRUE" as the logical value TRUE. The way to get Excel to see a literal string TRUE is to write it as a formula, ="TRUE". Many CSV writers implement this hack specifically on the assumption that files will be read back in Excel. So now your parser, if you're trying to process data the way Excel does, has to do the same.
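To make that concrete, here is a rough Python sketch of the post-processing step a parser would need after ordinary CSV unquoting. The helper name `interpret_cell` is made up for illustration, and matching only the exact uppercase spellings is an assumption; this is not how Excel or any particular library actually implements it.

```python
import csv
import io

def interpret_cell(raw: str):
    """Hypothetical helper: mimic the en-US behaviour described above."""
    # The ="..." hack: a formula consisting of a single string literal
    # is read back as that literal string, never as a boolean.
    if len(raw) >= 3 and raw.startswith('="') and raw.endswith('"'):
        return raw[2:-1].replace('""', '"')
    # Assumption: only the exact uppercase spellings are matched here.
    if raw == "TRUE":
        return True
    if raw == "FALSE":
        return False
    return raw

# Three fields: bare TRUE, quoted "TRUE", and the ="TRUE" formula hack
# (which has to be written as "=""TRUE""" to be valid RFC 4180 CSV).
data = 'TRUE,"TRUE","=""TRUE"""\r\n'
row = next(csv.reader(io.StringIO(data)))
cells = [interpret_cell(c) for c in row]
```

The first two fields come back as booleans because CSV quoting is already gone by the time the cell is interpreted; only the formula form survives as the string TRUE.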

So then you discover that this is actually localized! If you set your UI language to French (France), Excel will treat VRAI and FAUX as booleans while TRUE and FALSE are treated as literal strings.
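A sketch of what that does to the parser: the boolean check now needs a locale parameter and a lookup table. Only the en-US and fr-FR spellings come from the behaviour described here; the table layout, the locale keys, and the function name are assumptions for illustration.

```python
# Boolean spellings per UI locale. Only these two entries are taken from
# the behaviour described above; a real table would need many more.
BOOLEAN_LITERALS = {
    "en-US": {"TRUE": True, "FALSE": False},
    "fr-FR": {"VRAI": True, "FAUX": False},
}

def interpret_cell_localized(raw: str, locale: str = "en-US"):
    """Hypothetical helper: return a bool if raw is a boolean literal
    in the given locale, otherwise return the string unchanged."""
    literals = BOOLEAN_LITERALS[locale]
    return literals.get(raw, raw)

# The same field means different things depending on the UI language.
as_french = interpret_cell_localized("TRUE", "fr-FR")   # stays a string
as_english = interpret_cell_localized("TRUE", "en-US")  # becomes a boolean
```

Note that the caller has to know which locale the file was written for, and nothing in the CSV itself says so.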

What you thought was a simple CSV parser now has to handle localization as well. So that CSV parser library can roll its own dodgy localization support, use a tried and true solution, or just choose not to support the feature. Each choice has its own drawbacks.