Hacker News
by F_J_H 2333 days ago
I've wondered about this too, but have not gotten around to trying it yet.

We get a gnarly CSV log file back from our sensors in the field, which is really a "flattened" relational data model. What I mean by that is a file with "sets" of records of various lengths, all stacked on top of each other. So if you open it in Excel (which many users do), the first set of 50 rows may be 10 columns wide, the next 100 rows will be 20 columns wide, the next 45 wide, etc. The columns for each of these record sets also have different names and data types.

Converting to JSON is the obvious choice, but I've thought about just creating a SQLite file with a table for each set of records. Then, as others have said, you can use any number of tools to easily query/examine the file. You can also easily import it into a pandas DataFrame.
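
For what it's worth, here is a rough sketch of the "one table per record set" idea using only Python's standard library. It makes assumptions that may not match the real log format: record sets are separated by a blank line, the first row of each set is its header, and every row in a set has as many fields as that header. The function name `load_stacked_csv` and the table names `set_0`, `set_1`, ... are made up for illustration.

```python
import csv
import io
import sqlite3


def split_record_sets(text):
    """Split stacked CSV text into lists of rows, one list per record set.

    Assumption: a blank line marks the boundary between record sets.
    """
    sets, current = [], []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # csv.reader yields [] for a blank line
            if current:
                sets.append(current)
                current = []
        else:
            current.append(row)
    if current:
        sets.append(current)
    return sets


def load_stacked_csv(text, conn):
    """Create one table per record set and return the table names."""
    names = []
    for i, rows in enumerate(split_record_sets(text)):
        header, data = rows[0], rows[1:]  # assumption: first row is the header
        name = f"set_{i}"  # hypothetical table name
        # Quote column names so spaces or odd characters do not break the SQL.
        cols = ", ".join('"' + c.replace('"', '""') + '"' for c in header)
        marks = ", ".join("?" for _ in header)
        # Columns are declared without types, so values are stored as text.
        conn.execute(f'CREATE TABLE "{name}" ({cols})')
        conn.executemany(f'INSERT INTO "{name}" VALUES ({marks})', data)
        names.append(name)
    conn.commit()
    return names


sample = "id,temp\n1,21.5\n2,22.0\n\nid,lat,lon,status\n7,51.0,-114.1,ok\n"
conn = sqlite3.connect(":memory:")  # use a file path to keep the result
print(load_stacked_csv(sample, conn))
```

Once the tables exist, something like `pandas.read_sql_query('SELECT * FROM set_0', conn)` should pull a table into a DataFrame. Real column types would need to be declared or converted separately.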

One concern is file size. Any comments on this? I can try it, but I wonder if anyone knows off the top of their head whether a large JSON file converted to a SQLite file would be a lot larger or smaller?
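
The quickest way to find out is probably to measure it on representative data. Below is a minimal sketch that writes the same records once as a JSON array of objects and once as a SQLite table, then reports both file sizes. The table name `readings`, the helper `compare_sizes`, and the sample rows are invented for illustration. The result will depend heavily on the real data (JSON repeats every key in every record, while SQLite adds page and per-row overhead), so this is only a way to check, not an answer.

```python
import json
import os
import sqlite3
import tempfile


def compare_sizes(columns, rows):
    """Return (json_bytes, sqlite_bytes) for the same set of records."""
    with tempfile.TemporaryDirectory() as tmp:
        json_path = os.path.join(tmp, "data.json")
        db_path = os.path.join(tmp, "data.db")

        # JSON: one object per record, so column names repeat in every record.
        with open(json_path, "w") as f:
            json.dump([dict(zip(columns, r)) for r in rows], f)

        # SQLite: column names are stored once, in the table definition.
        conn = sqlite3.connect(db_path)
        cols = ", ".join(f'"{c}"' for c in columns)
        marks = ", ".join("?" for _ in columns)
        conn.execute(f"CREATE TABLE readings ({cols})")
        conn.executemany(f"INSERT INTO readings VALUES ({marks})", rows)
        conn.commit()
        conn.close()

        return os.path.getsize(json_path), os.path.getsize(db_path)


# Made-up sample data; substitute a real record set for a meaningful number.
rows = [(i, 20.0 + i * 0.1, "ok") for i in range(10000)]
json_bytes, sqlite_bytes = compare_sizes(["id", "temp", "status"], rows)
print(f"JSON: {json_bytes} bytes, SQLite: {sqlite_bytes} bytes")
```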

edit: clarity

1 comment

Yes, it is great for that.

You only have to read the CSV file once, and after that you have a nice set of tables you can query any which way you want.

I use SQLite as an intermediate step between text files and static HTML, for example.
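
As a rough illustration of that last step (not necessarily how it is done in the workflow described above), a query result can be turned into a static HTML table with just the standard library. The helper `query_to_html_table` and the `readings` table are hypothetical.

```python
import html
import sqlite3


def query_to_html_table(conn, sql, params=()):
    """Run a query and return the result as an HTML <table> string."""
    cur = conn.execute(sql, params)
    headers = [d[0] for d in cur.description]  # column names of the result
    head = "".join(f"<th>{html.escape(h)}</th>" for h in headers)
    lines = ["<table>", f"  <tr>{head}</tr>"]
    for row in cur:
        # Escape every value so data cannot inject markup into the page.
        cells = "".join(f"<td>{html.escape(str(v))}</td>" for v in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Made-up data standing in for tables loaded from the text files.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, temp REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 21.5), ("a", 22.5), ("b<1>", 19.0)],
)
page = query_to_html_table(
    conn,
    "SELECT sensor, AVG(temp) AS avg_temp FROM readings "
    "GROUP BY sensor ORDER BY sensor",
)
print(page)
```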