With PyO3, I built a library that parses datetimes 10x faster than `datetime.strptime` in just a few lines of code: https://github.com/gukoff/dtparse It just calls Rust's chrono library, which does the parsing, and wraps the result in a Python object. You can do this for any Rust library; it's very, very easy! The only slightly complicated part is distribution. You need to use https://github.com/PyO3/maturin or https://github.com/PyO3/setuptools-rust, and of course you need to have Rust installed on the wheel-building machine. Feel free to use this repo as a reference if you want to build something similar. The code is commented, and there's a working GitHub Action that builds the wheels for all platforms and uploads them to PyPI: https://github.com/gukoff/dtparse/tree/master/.github/workfl...
I ended up looking at a bunch of different ways of processing timestamps in Python: strptime(), string parsing, regex, datetime.fromisoformat(), NumPy, Pandas, and more. I got a 46x speedup using datetime.fromisoformat(). Other approaches got anywhere from a 4x to a 40x speedup, and a couple of approaches were an order of magnitude slower than strptime().
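If you want to try this kind of comparison yourself, here's a minimal sketch that times `strptime()` against the standard library's ISO parser, `datetime.fromisoformat()`, using `timeit`. The sample timestamp, the format string, and the call count are assumptions picked for illustration, not taken from the blog post, and the ratio you see will depend on your machine and Python version.

```python
import timeit
from datetime import datetime

# Hypothetical sample data: one ISO 8601 timestamp and the matching strptime format.
SAMPLE = "2023-07-14T09:30:15"
FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_strptime(ts):
    """Parse with an explicit format string."""
    return datetime.strptime(ts, FORMAT)


def parse_fromisoformat(ts):
    """Parse with the built-in ISO 8601 parser (Python 3.7+)."""
    return datetime.fromisoformat(ts)


def time_parsers(number=20_000):
    """Return the seconds each parser takes for `number` calls."""
    return {
        "strptime": timeit.timeit(lambda: parse_strptime(SAMPLE), number=number),
        "fromisoformat": timeit.timeit(lambda: parse_fromisoformat(SAMPLE), number=number),
    }


if __name__ == "__main__":
    for name, seconds in time_parsers().items():
        print(f"{name}: {seconds:.4f}s")
```

Both functions return the same `datetime` for this input, so the timing difference is purely about the parsing path.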
My takeaway was there's no substitute for profiling the actual code you're running, and focusing on the specific bottlenecks in your own project. I wrote this up in a blog post if anyone's interested, "What's faster than strptime()?"
https://ehmatthes.com/blog/faster_than_strptime/
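For the "profile the actual code" part, a rough sketch with the standard library's `cProfile` might look like this. `load_timestamps` is a hypothetical stand-in for whatever your real hot path is, and the timestamp format is an assumption; swap in your own function and data.

```python
import cProfile
import io
import pstats
from datetime import datetime


def load_timestamps(lines):
    """Hypothetical stand-in for the real hot path in your project."""
    return [datetime.strptime(line, "%Y-%m-%d %H:%M:%S") for line in lines]


def profile_call(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    sample = ["2023-07-14 09:30:15"] * 50_000
    _, report = profile_call(load_timestamps, sample)
    print(report)
```

If the report shows most of the cumulative time sitting under `strptime`, that's the signal that swapping the parser is worth it; if not, the bottleneck is somewhere else.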