| If you're doing user-supplied CSVs, definitely... but if you are ingesting CSVs from a known source with a known format (<insert audible sigh here>), it can definitely make sense to use a high-speed optimized ingester. One might wonder whether it would be worth the time to optimise the CSV parsers that ship with the various language runtimes. I took a look: all of them scan naively byte by byte, and all except PHP's and Python's are written in the respective language itself, which takes any form of SIMD optimization right off the table (okay, maybe something could be done in Java, but it seems incredibly complex, see https://www.morling.dev/blog/fizzbuzz-simd-style/): - PHP's isn't optimized anywhere, but at least it's C: https://github.com/php/php-src/blob/1c0e613cf1a24cdc159861e4... - Python's C implementation is the same: https://github.com/python/cpython/blob/main/Modules/_csv.c - Java doesn't have a "standard" way at all (https://www.baeldung.com/java-csv-file-array), and OpenCSV seems to be the usual object-oriented hell (https://sourceforge.net/p/opencsv/source/ci/master/tree/src/...). - Ruby's CSV is native Ruby: https://github.com/ruby/ruby/blob/bd65757f394255ceeb2c958e87... |
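
To make "naive byte-by-byte scanning" concrete, here's a rough Python sketch of the general shape such parsers take: one state check per input byte. It is not taken from any of the implementations linked above; the function names `split_fields_naive` and `split_fields_bulk` are made up for illustration. The second function shows the kind of shortcut a known format allows, assuming (and this is the big assumption) that fields are never quoted. It is not SIMD either; it only moves the delimiter search out of the per-byte loop and into `bytes.find`, which runs in the interpreter's C code.

```python
from __future__ import annotations


def split_fields_naive(line: bytes, delim: int = ord(","), quote: int = ord('"')) -> list[bytes]:
    """Split one CSV record by inspecting every byte, one at a time."""
    fields = []
    current = bytearray()
    in_quotes = False
    i, n = 0, len(line)
    while i < n:
        b = line[i]
        if in_quotes:
            if b == quote:
                if i + 1 < n and line[i + 1] == quote:
                    current.append(quote)  # doubled quote means a literal quote
                    i += 1
                else:
                    in_quotes = False      # closing quote
            else:
                current.append(b)
        elif b == quote:
            in_quotes = True               # opening quote
        elif b == delim:
            fields.append(bytes(current))  # field boundary
            current.clear()
        else:
            current.append(b)
        i += 1
    fields.append(bytes(current))          # last field has no trailing delimiter
    return fields


def split_fields_bulk(line: bytes, delim: bytes = b",") -> list[bytes]:
    """Known format, ASSUMED to contain no quoted fields: jump between delimiters."""
    fields = []
    start = 0
    while True:
        end = line.find(delim, start)      # search happens in C, not per byte in Python
        if end == -1:
            fields.append(line[start:])
            return fields
        fields.append(line[start:end])
        start = end + 1


if __name__ == "__main__":
    print(split_fields_naive(b'a,"b,c",d'))
    print(split_fields_bulk(b"a,b,,d"))
```

The naive version has to handle quoting because it can't trust its input; the bulk version gets to skip all of that precisely because the format is known, which is the trade-off described above.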