by jellyfishbeaver 447 days ago
I work as a data engineer in the financial services industry, and I am still amazed that CSV remains the preferred delivery format for many of our customers. We're talking datasets that cost hundreds of thousands of dollars to subscribe to.

"You have a REST API? Parquet format available? Delivery via S3? Databricks, you say? No thanks, please send us daily files in zipped CSV format on FTP."

2 comments

> REST API

Requires a programmer

> Parquet format

Requires a data engineer

> S3

Requires AWS credentials (API access token and secret key? IAM user console login? SSO?), AWS SDK, manual text file configuration, custom tooling, etc. I guess with Cyberduck it's easier, but still...

> Databricks

I've never used it but I'm gonna say it's just as proprietary as AWS/S3 but worse.

Anybody with Windows XP can download, extract, and view a zipped CSV file over FTP, with just what comes with Windows. It's familiar, user-friendly, simple to use, portable to any system, compatible with any program. As an almost-normal human being, this is what I want out of computers. Yes the data you have is valuable; why does that mean it should be a pain in the ass?

Yes, because users can read the data themselves and don't need a programmer.

Financial users live in Excel. If you stick to one locale (unfortunately it will have to be US) then you are OKish.
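
To make the locale point concrete, here is a rough sketch of what can go wrong when a consumer assumes US number formatting. The sample rows, the `parse_us_number` helper, and the `read_prices` function are made up for illustration; they are not from any real feed or vendor spec.

```python
import csv
import io

# Hypothetical sample: the same row exported under two locale conventions.
US_EXPORT = 'date,price\n2024-01-31,"1,234.56"\n'   # comma-delimited, '.' decimal
DE_EXPORT = 'date;price\n2024-01-31;1.234,56\n'     # semicolon-delimited, ',' decimal

def parse_us_number(text):
    """Parse a number assuming US conventions: ',' groups thousands, '.' is the decimal point."""
    return float(text.replace(",", ""))

def read_prices(raw, delimiter):
    """Read the 'price' column from CSV text, always applying US number parsing."""
    reader = csv.DictReader(io.StringIO(raw), delimiter=delimiter)
    return [parse_us_number(row["price"]) for row in reader]

us_prices = read_prices(US_EXPORT, ",")   # parsed as intended
de_prices = read_prices(DE_EXPORT, ";")   # no error raised, but the value is wrong

print(us_prices)
print(de_prices)
```

The US file parses to 1234.56, while the German-style file parses without any error to 1.23456. That silent failure is the reason to agree on one locale up front.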