|
|
|
|
|
by sheetjs
1753 days ago
|
|
Pre-Unicode issues still haunt us today, kept alive by various file formats that rely on system encoding. Under the Apple "Mac-Roman" encoding [1], the standard MacOS encoding before OSX switched to Unicode, byte 0xBD currently is capital omega (U+03A9 Ω). However, in the original 1994 release of the character set, they erroneously mapped to the ohm sign (U+2126 Ω) Apple eventually fixed this in 1997, as noted in the changelog: # n04 1997-Dec-01 Update to match internal utom<n3>, ufrm<n22>:
# Change standard mapping for 0xBD from U+2126
# to its canonical decomposition, U+03A9.
However, in 1996, Microsoft copied over the mac encoding to CP10000 using the incorrect character [2]. Unfortunately the codepage was not corrected when Apple realized their mistake.This discrepancy leads to a huge number of strange issues with various versions of Excel for Mac (BOM-less CSV, SYLK and other plaintext formats default to system encoding) and other software that use Microsoft's interpretation of Apple's Mac-Roman encoding rather than Apple's official character set mapping. [1] http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/ROMAN.T... [2] http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/MAC/RO... |
|