by sheetjs
4269 days ago
> how are the strings in excel encoded anyway?

Length-prefixed byte arrays encoded using various code pages. Excel uses only a small number of them: https://github.com/SheetJS/js-codepage/blob/master/excel.csv (the columns are CP#, mapping, single/double-byte)

> Does the language this library is written in support that translation? Are there modules to do that? Is the license for those module(s) necessarily compatible?

If we can put together an Apache2-licensed module in JS in an afternoon (https://github.com/SheetJS/js-codepage), it can be done in Python.

> Who's going to go through the different document versions to confirm, and adjust for the various encodings for non-ascii characters?

Someone already did that: https://github.com/SheetJS/test_files/tree/master/biff5 has artifacts for every language type.
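To make the "length-prefixed byte arrays encoded using various code pages" point concrete, here is a minimal Python sketch. It assumes a one-byte length prefix (the actual prefix width depends on the record type and file version), and both the `CODEPAGE_TO_CODEC` table and the `read_prefixed_string` helper are hypothetical names made up for illustration; only the standard-library codecs they point at are real.

```python
import struct

# Hypothetical mapping from a few code page numbers to Python's built-in
# codec names. A real reader would cover every entry in excel.csv.
CODEPAGE_TO_CODEC = {
    932: "cp932",        # Japanese (double-byte)
    1251: "cp1251",      # Cyrillic (single-byte)
    1252: "cp1252",      # Western European (single-byte)
    10000: "mac_roman",  # Macintosh Roman (single-byte)
}


def read_prefixed_string(buf, offset, codepage):
    """Decode one length-prefixed string starting at `offset`.

    Assumes a one-byte prefix that counts bytes, not characters.
    Returns (decoded_text, offset_of_next_field).
    """
    (length,) = struct.unpack_from("<B", buf, offset)
    start = offset + 1
    raw = buf[start:start + length]
    if len(raw) != length:
        raise ValueError("string runs past the end of the buffer")
    return raw.decode(CODEPAGE_TO_CODEC[codepage]), start + length


# Single-byte code page: 0xE9 is "e with acute accent" in cp1252.
text, next_offset = read_prefixed_string(b"\x05caf\xe9!", 0, 1252)

# Double-byte code page: four bytes decode to two characters in cp932.
jp_text, jp_next = read_prefixed_string(b"\x04\x93\xfa\x96\x7b", 0, 932)
```

The work is a table lookup plus one `decode` call per string; the hard part is knowing which code page a given file declares, which is what the test files linked above help verify.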
|
I thought Python 2 was Unicode-unfriendly. So not as easy as JS.