by jellyfishbeaver
447 days ago
I work as a data engineer in the financial services industry, and I am still amazed that CSV remains the preferred delivery format for many of our customers. We're talking datasets that cost hundreds of thousands of dollars to subscribe to. "You have a REST API? Parquet format available? Delivery via S3? Databricks, you say? No thanks, please send us daily files in zipped CSV format on FTP."
> REST API
Requires a programmer
> Parquet format
Requires a data engineer
> S3
Requires AWS credentials (API access key and secret key? IAM user console login? SSO?), the AWS SDK, manual text file configuration, custom tooling, etc. I guess with Cyberduck it's easier, but still...
> Databricks
I've never used it but I'm gonna say it's just as proprietary as AWS/S3 but worse.
Anybody with Windows XP can download, extract, and view a zipped CSV file over FTP, with just what comes with Windows. It's familiar, user-friendly, simple to use, portable to any system, compatible with any program. As an almost-normal human being, this is what I want out of computers. Yes, the data you have is valuable; why does that mean it should be a pain in the ass?
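For the programmers in the thread, the same "no special tooling" point holds on the code side. Here's a minimal sketch, assuming Python and nothing beyond its standard library. The function names, the `prices.csv` file name, and the sample rows are all made up for illustration, and the zip is built in memory so it runs without a server.

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_bytes, encoding="utf-8"):
    """Return the rows of the first .csv member of a zip archive as dicts."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Assumption: the archive contains at least one .csv member.
        name = next(n for n in archive.namelist() if n.lower().endswith(".csv"))
        with archive.open(name) as raw:
            # Decode bytes to text; newline="" lets the csv module handle line endings.
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            return list(csv.DictReader(text))


def make_sample_zip():
    """Build a tiny in-memory zip standing in for a vendor's daily file (hypothetical data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("prices.csv", "ticker,close\nABC,10.5\nXYZ,20.25\n")
    return buffer.getvalue()


rows = read_zipped_csv(make_sample_zip())
for row in rows:
    print(row["ticker"], row["close"])
```

In real use the bytes would come from the FTP download instead of `make_sample_zip` (the standard library's `ftplib` can do that part too), but no credentials file, SDK, or columnar-format reader is involved at any step.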