Hacker News new | ask | show | jobs
by sheetjs 1420 days ago
Unfortunately the source spreadsheet was not shared, but the root cause was a misreported codepage (which you can actually fix with a hex editor!)

"Â ", where the second character is a unicode non-breaking space characters (U+00A0), is the CP1252 encoding of the byte sequence [0xC2, 0xA0] which is the UTF-8 encoding of "\u00A0". So all that really needs to happen is to get Excel to think that the file is UTF-8 encoded.

The fix? For XLS, there is a special Codepage record that indicates how strings are to be interpreted. The magic value to force UTF-8 interpretation is 65001, and it is possible to use a hex editor to find exactly where that record is located and change it.