by nickjj
804 days ago
> Knowing the command line is one of those unsung critical skills of being a software engineer.

Yep, the command line is what lets you solve "We have 178 CSV dumps of tables that hold ~60 GB of data and we want them imported into a SQL database, there's no previous DB schema info, here's a zip file of questionably named CSVs, can you have this done in 2 days?". Meanwhile there are 8,000+ columns of data that are strings, booleans, datetimes, etc. and some of the files are 15 GB each.

It didn't take much shell scripting to solve that problem in a way that lets you run it against a directory of CSV files and have it produce SQL files with the table schemas to create, plus the SQL to efficiently import each table from its CSV. Basically a little bit of shell scripting and using tools like find, head, sed, grep, wc and friends. It took 4 hours to solve the problem in a way that was testable.
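
To give a rough idea of the shape of that kind of script, here is a minimal sketch, not the actual script from that job. It assumes a plain comma-separated header on the first line (no quoted commas), types every column as TEXT instead of inferring booleans or datetimes, and emits a PostgreSQL-style COPY statement. The function name `csv_to_sql` is made up for illustration.

```shell
#!/bin/sh
# Sketch only: print a CREATE TABLE and a COPY statement for one CSV file.
# Assumptions: header is on line 1, fields are split by plain commas,
# every column is typed TEXT, and the target database is PostgreSQL.

csv_to_sql() {
  csv=$1

  # Table name: file name without directory or .csv suffix,
  # with anything that is not a letter, digit or underscore turned into "_".
  table=$(basename "$csv" .csv | sed 's/[^A-Za-z0-9_]/_/g')

  # Column list: take the header line, drop carriage returns and double quotes,
  # put one column per line, sanitize the name, append the type, rejoin with commas.
  columns=$(head -n 1 "$csv" | tr -d '\r"' | tr ',' '\n' \
    | sed 's/[^A-Za-z0-9_]/_/g; s/$/ TEXT/' | paste -s -d , -)

  printf 'CREATE TABLE %s (%s);\n' "$table" "$columns"
  printf "COPY %s FROM '%s' WITH (FORMAT csv, HEADER true);\n" "$table" "$csv"
}
```

Running it over a whole directory could then be something like `find ./dumps -name '*.csv' | while read -r f; do csv_to_sql "$f"; done > import.sql`, where `./dumps` is a placeholder path. Real type detection would mean sampling rows with head and grep per column, which is left out here.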