by twotwotwo 3581 days ago
As someone notes on the bug, if you were rolling your own, there are some other things you could do--for example, return a [][]byte whose slices point into the reader's internal buffer and are only usable until the next row is read.

Making a version of encoding/csv that retains most of its features (custom delimiters, handling backslashes and quoting and \r) but streams like that would be a fun open source project for someone who likes Making Things Go Fast.

2 comments

I did this a couple of months ago and got a >5x speedup. It's at the expense of dropping quoting though, so no commas or newlines can be in the input data.

https://github.com/pwaller/usv

That guy here. I'm also interested in rolling a version that only supports standard delimiters so I can forgo rune parsing. Rune parsing accounts for about 30% of the processing; not sure how much a bytes implementation could save, but I'm hopeful.
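
For what it's worth, a small sketch of the difference, assuming the delimiter is a single ASCII byte. The function names are made up for illustration. Scanning raw bytes is safe in that case because UTF-8 never uses a byte below 0x80 inside a multi-byte sequence, so an ASCII delimiter can't be mistaken for part of another character.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// countFieldsRunes decodes one rune at a time and compares it to the delimiter.
// This is the general approach needed when the delimiter may be any rune.
func countFieldsRunes(line []byte, delim rune) int {
	n := 1
	for len(line) > 0 {
		r, size := utf8.DecodeRune(line)
		if r == delim {
			n++
		}
		line = line[size:]
	}
	return n
}

// countFieldsBytes skips decoding and searches for the delimiter byte directly.
// Assumption: delim is ASCII (< 0x80).
func countFieldsBytes(line []byte, delim byte) int {
	n := 1
	for {
		i := bytes.IndexByte(line, delim)
		if i < 0 {
			return n
		}
		n++
		line = line[i+1:]
	}
}

func main() {
	line := []byte("héllo,wörld,42")
	// Both approaches should agree on the field count for an ASCII delimiter.
	fmt.Println(countFieldsRunes(line, ','), countFieldsBytes(line, ','))
}
```

This says nothing about how much time it would actually save; that would need benchmarking on real data.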