Hacker News new | ask | show | jobs
by keithwarren 4226 days ago
There is so much here, it is so verbose...so religious.

There is the appearance of facts, but mostly it is bias by omission. Take for example the CSV complaints..."MS SQL Server can neither import nor export CSV. Most people don't believe me when I tell them this". This is just false, patently and obviously false. BCP and SSIS, both part of SQL Server have excellent support for importing and exporting flat files. I did keyword searches through the whole post, no instances of BCP, SSIS or Integration Services (the long term for SSIS).

I am sure others will break this thing apart section for section but I wish people would be fair and open about stuff like this - I mean, seriously if you are going to put the time and effort into a long document like this, don't be so blatantly bias.

3 comments

I've done a fair amount of work importing CSV files into SQL Server 2008. I have no idea if this is still broken, but I can confirm that 2008 mishandles CSV files where the character used to surround string columns appears escaped within the column itself. Such as:

57,4.3209,"green","He calls himself""Waldo""",98,"Y"

This is a perfectly valid CSV file, but SSIS will choke. I ended up writing a custom script component to load these files anyways.

That said, I absolutely love SQL Server, and I think this guy is a little over the top in his religious hatred of it. In an Enterprise setting, it works really, really well.

They likely should have said "standards compliant CSV." I've worked a lot with CSV (too much!) and let me tell you Microsoft's tools are simply awful at following the CSV standard.

Excel in particular drives me up the will. Plus the automatic inferences in Excel break all kind of things. Have a six digit number? Randomly a date. Have a UPC? Going to corrupt it in numerous ways (leading zero stripping, converting to scientific notation, etc).

I can definitely sympathize with anyone who criticises Microsoft's CSV support. They really just want you to use an Excel format (either the binary format or the zip-XML one). However Excel formats are overly complicated for the type of data migration work CSV is often utilised for (XML is too large, it adds up, and requires custom code to import/export (even if that is XSL from CSV), JSON might be better but isn't "yet" popular).

I completely agree with you on Excel. It's gotten bad enough that I'm starting to toy with the idea of replacing Excel with a CSV viewer in my workflow (since all that I use Excel for is to view small-ish CSVs, anyway).
I've gotten into using node.js with the mssql module for handling csv or xml in a stream... it does work really well, and the compatibility is pretty smooth.

I wish that line-item JSON were more popular, it's more expressive than csv, the tooling (programatic) seems to be a bit better, and much less verbose than equivalent XML, which I always had a problem with, as mapping XML to/from an object model is always more painful by comparison.

line-item utf-8 json+gz work really well for import/export data...

> let me tell you Microsoft's tools are simply awful at following the CSV standard.

There's no such thing. CSV is a convention that people mostly agree on but there's no written CSV standard and no CSV standards body.

RFC 4180 is close enough for me.
what gets me is scientific notation. Sure, I'd love for you not ask me and convert all my large numbers into scientific notation.... and loose digits
> I mean, seriously if you are going to put the time and effort into a long document like this, don't be so blatantly bias.

It's also totally counterproductive. If you tell me to use something and you start telling me things that aren't true I'm going to be deeply skeptical of your possibly-valid points if I know you're either ignorant or dishonest in some cases.

this is, well, particularly apropos for keithwarren -- I had no end of pain getting csv into sql server; pg handles it fine
anecdotal but I have not had such issues, but being fair I have mostly not had to deal with non-english scenarios.

It is one thing to say 'SQL Server takes some effort to import CSV files with X scenarios'; but to say it does not import them at all?