Hacker News new | ask | show | jobs
by slysf 3179 days ago
I'm sorry but I just can't take MSSQL Server seriously when it cannot export valid CSV. It does not escape commas for CSV or tabs for TSV, and has no option to. If they had their own flat file export that could be re-imported I could forgive it because I could write my own library for that format, but there simply is no way to cleanly export data from MSSQL. I had to write a SQL query that was ~2000 lines of 80 columns in order to export a database to CSV and properly escape each field manually, and it took FOREVER.
6 comments

Not just exporting though. I've seen automated systems that attempt to import csv files directly through SQL Server, and it would always break at quoted fields and fields with newlines. And nobody could figure out why it broke all the time, and fields would be out of order or shifted or missing.

I wasn't able to convince people to fix it but I ended up writing a Go utility to reliably import/export csv into sql server using Bulk Insert for my own use (and sanity). And it ended up being faster than other methods to boot.

> I'm sorry but I just can't take MSSQL Server seriously when it cannot export valid CSV.

“Valid CSV” is a dubious phrase, since the closest thing CSV has to a spec is an RFC that tried to map out the space of the wide variety of inolementations then existing.

Anyhow, SQL Server is a database server; there are a wide variety of ETL tools that will export from the server to any common (or not, really) format you like, including just about any flavor of CSV/TSC you might be interested in.

If you can't write data to a file, and then read it back in, it's not valid.
There is no such thing as “Valid CSV”. It’s an ambiguous format, with dozens of variants.

SQL server supports CSV with exactly the same senantics as Excel. Which is what people expect 99.9% of the time, because most CSV data goes to or from Excel in the real world.

If you’re DB-to-DB imports and exports, use a sane file format with a sane delimiter such as the Unicode INFORMATION SEPARATOR and RECORD SEPARATOR characters which were inherited from ASCII.

Not sure if you're trolling here but... have you tried bcp?

"The bulk copy program utility (bcp) bulk copies data between an instance of Microsoft SQL Server and a data file in a user-specified format. The bcp utility can be used to import large numbers of new rows into SQL Server tables or to export data out of tables into data files."

https://docs.microsoft.com/en-us/sql/tools/bcp-utility

Thank you, yes I did try it, bcp does not escape commas or newlines in a field.
Compare that with creating a table right from csv file (using fdw) in postgres without importing data. Very neat
Polybase does this for SQL Server and is the recommended way to.load data into Azure SQL DW.
Looks good, setup is quite a hassle. but still not as straightforward as postgres file fdw
A missing small feature here or there is kind of a lame reason to not take a large and well-respected product seriously. That said, bcp and PowerShell (piping the output of Invoke-Sqlcmd through Export-CSV) are among the more common ways that can be done with SQL Server.
If they can't get a fundamentally simple feature right, why would I give them the benefit of the doubt that other features are thought out? bcp does _not_ escape commas or newlines in a field, you have to write sql to replace() those characters on select.
Exporting data incorrectly isn't a missing small feature. It's data corruption.