by bloodearnest 4626 days ago
I think readlines() is slower in Python 3 because of Unicode decoding. Python 2 did no decoding: readlines() returned Python 2's str type, which is just bytes. You had to decode it yourself (or open the file with an encoding-aware wrapper such as codecs.open() or io.open()). Python 3's readlines() decodes into Unicode code points by default when the file is opened in text mode.

Run a benchmark with the file opened with an explicit encoding to see if I'm right. If the file's already being decoded in both cases, there shouldn't be as big a difference.
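
Something like this rough sketch would isolate the decoding cost within Python 3 alone. It compares binary mode (no decoding, roughly what Python 2 did) against text mode with an explicit encoding. The sample file, its contents, and the helper names (make_sample_file, read_bytes, read_text, benchmark) are all made up for illustration, and it is not a true Python 2 vs 3 comparison, so treat the numbers as indicative only.

```python
import os
import tempfile
import timeit


def make_sample_file(num_lines=100_000):
    """Write a hypothetical UTF-8 test file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8", newline="\n") as f:
        for i in range(num_lines):
            # Include a non-ASCII character so the decoder has real work to do.
            f.write(f"line {i}: some text with a non-ASCII char \u00e9\n")
    return path


def read_bytes(path):
    # Binary mode: no decoding, lines come back as bytes (like Python 2's str).
    with open(path, "rb") as f:
        return f.readlines()


def read_text(path):
    # Text mode: every line is decoded to str using the explicit encoding.
    with open(path, "r", encoding="utf-8", newline="\n") as f:
        return f.readlines()


def benchmark(path, repeat=5):
    """Return the best-of-`repeat` wall time for each read mode, in seconds."""
    t_bytes = min(timeit.repeat(lambda: read_bytes(path), number=1, repeat=repeat))
    t_text = min(timeit.repeat(lambda: read_text(path), number=1, repeat=repeat))
    return t_bytes, t_text


if __name__ == "__main__":
    path = make_sample_file()
    try:
        t_bytes, t_text = benchmark(path)
        print(f"binary readlines(): {t_bytes:.4f}s")
        print(f"text readlines():   {t_text:.4f}s")
    finally:
        os.remove(path)
```

If the gap between the two timings is about the size of the slowdown being discussed, decoding is the likely culprit. If it's much smaller, something else is going on.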