by fjert 3306 days ago
Wow, this seems simple, but I got excited immediately when I saw it. Since I have been writing code to export data for clients, I have grown so tired of Kingsoft Spreadsheets and Google Sheets lagging like crazy with any sizable amount of data. This will be a cool new tool to show my coworkers tomorrow, and I'll be using it. Performance seems very snappy so far!
3 comments

I invested some effort to keep it performant even with fairly large CSV files, including a custom port of some C++ code for fast CSV import. My current favorite example is the Met Museum's 228MB, 450k-row collection data set; it takes about 12 seconds to open in Tad on my 2013 MacBook Pro. Definitely not lag-free (and that is hard to achieve without going to some serious column-store data warehouse like Amazon Redshift), but still reasonable. https://twitter.com/antonycourtney/status/869252722624561152
Thanks for putting this out there!

There are some projects out there using memory-mapped files to do fast CSV parsing. That could be a nice way to speed up loading into memory and to scroll through the file in real time. I can't find the link to the library I saw it used in, but it might be an interesting avenue to consider. Another library that does it seems to be astropy's fast ASCII I/O module [1].

[1]: http://docs.astropy.org/en/stable/io/ascii/fast_ascii_io.htm...
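
To make the memory-mapping idea a bit more concrete, here is a rough sketch in Python using only the standard `mmap` module. It is not how Tad or astropy do it; the helper names `build_row_index` and `read_row` are made up for illustration, and the parser is deliberately naive (it assumes UTF-8, comma separators, and no newlines or commas inside quoted fields). The point is only that once you have row start offsets, fetching any row for a scrolling view is a cheap slice of the mapping.

```python
import mmap


def build_row_index(mm):
    """Return the byte offset at which each row starts.

    Naive on purpose: every b"\n" is treated as a row boundary,
    so quoted fields containing newlines are NOT handled.
    """
    offsets = [0]
    pos = mm.find(b"\n")
    while pos != -1:
        offsets.append(pos + 1)
        pos = mm.find(b"\n", pos + 1)
    # A trailing newline would otherwise produce an empty last row.
    if offsets[-1] >= len(mm):
        offsets.pop()
    return offsets


def read_row(mm, offsets, i):
    """Decode row i on demand by slicing the mapping (hypothetical helper)."""
    start = offsets[i]
    end = offsets[i + 1] if i + 1 < len(offsets) else len(mm)
    line = mm[start:end].rstrip(b"\r\n").decode("utf-8")
    return line.split(",")
```

Usage would be along the lines of opening the file in binary mode, creating `mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)`, building the index once, and then calling `read_row` only for the rows currently visible.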

Try benchmarking OS read() calls against sequential or random reads through a memory mapping. Whenever I do this, the OS read() calls end up being quite a bit faster.
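
If anyone wants to try that comparison, a minimal sequential-scan harness in Python might look like the sketch below. The 1 MiB chunk size, the function names, and the "count newlines" workload are all arbitrary choices of mine, not anything from Tad. Keep in mind that the OS page cache will dominate repeated runs on the same file, and that slicing an `mmap` in Python copies the bytes, so the numbers say more about this particular setup than about read() vs. mmap in general.

```python
import mmap
import time

CHUNK = 1 << 20  # 1 MiB per step; an arbitrary choice for this sketch


def scan_with_read(path):
    """Count newlines using unbuffered reads (one read() call per chunk)."""
    count = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            buf = f.read(CHUNK)
            if not buf:
                break
            count += buf.count(b"\n")
    return count


def scan_with_mmap(path):
    """Count newlines by walking a read-only memory mapping sequentially."""
    count = 0
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for start in range(0, len(mm), CHUNK):
            count += mm[start:start + CHUNK].count(b"\n")
    return count


def time_it(fn, path, repeats=3):
    """Return the best wall-clock time in seconds over a few repeats."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(path)
        best = min(best, time.perf_counter() - t0)
    return best
```

Calling `time_it(scan_with_read, path)` and `time_it(scan_with_mmap, path)` on the same large CSV gives a first rough comparison; a random-access variant would need a different access pattern than this sequential walk.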
Are you familiar with the R package data.table? Its CSV parser is blazing fast. Pandas (the Python tabular data library) also implements a speedy CSV parser. Both are written in C under the hood.
I really don't understand how anyone can use Google's office apps; the UI is painfully slow. I was using Gnumeric before, but I'll try this out. Thanks, OP!
The only two selling points for me with Gnumeric are a rough function equivalence with MS Excel 2003 and the output to LaTeX tables (which is _absolutely awesome_). Other than that, it's always been pretty buggy for me.

Not that LibreOffice gets a pass, but it doesn't crash nearly as often for me.

They have one and only one "killer feature" over various less-sucky apps - they're in browser, so it's much easier to collaborate on a document with random people on random operating systems (including mobile).

But honestly, if you care about collaborating on text and not its formatting, then I'd suggest hosting an Etherpad instance somewhere :).

You should definitely try Delimit (http://www.delimitware.com). I was impressed by how easily I could handle >1 million rows with it.