Hacker News
by deeviant 4101 days ago
Redshift is like a prison, but with excellent accommodations. It's a great platform, but it's pretty much the perfect example of vendor lock-in.
5 comments

How is Redshift vendor lock-in, though?

Put your data in S3 in CSV/TSV/JSON format. If you want to switch to another provider, just figure out how to import it and you are all set. Figuring out the limitations of the different platforms, and tuning and optimizing for them, is the difficult part.

Data migration is almost always painful and time-consuming. When choosing your data provider, you have to be careful because it is very likely to be a long-term commitment. In that sense, there is always vendor lock-in in the DW world. But it is driven largely by the nature of the application itself, less so by the intention of the provider.

Compared to the lock-in of the AWS ecosystem in general, Redshift honestly isn't that bad. You can unload all of your data into S3 and then do whatever you want with it. I'd be surprised if most data warehousing solutions had such an easy way of exporting the data.
In addition, if you store your data in S3 and have Redshift load it from there then you don't even need to do an export - just leave your source data in S3 after Redshift's loaded it, and you're all ready to switch to another platform.
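To make the unload/load round trip concrete, here is a minimal sketch that only builds the SQL text for Redshift's UNLOAD and COPY commands. The helper names, bucket path, table name, and IAM role ARN are made up for illustration, and the option set (gzipped CSV) is just one reasonable choice; check the Redshift docs for what fits your data.

```python
def build_unload(query: str, s3_prefix: str, iam_role: str) -> str:
    """Build a Redshift UNLOAD statement exporting `query` to S3 as gzipped CSV.

    Hypothetical helper: it only assembles SQL text and does not connect anywhere.
    """
    # UNLOAD takes the SELECT as a quoted string, so single quotes
    # inside the query have to be doubled.
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV\n"
        "GZIP;"
    )


def build_copy(table: str, s3_prefix: str, iam_role: str) -> str:
    """Build the matching COPY statement that loads those files from S3."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV\n"
        "GZIP;"
    )


if __name__ == "__main__":
    # Placeholder values, not real resources.
    role = "arn:aws:iam::123456789012:role/ExampleRedshiftRole"
    prefix = "s3://example-bucket/exports/events_"
    print(build_unload("SELECT * FROM events WHERE kind = 'click'", prefix, role))
    print(build_copy("events", prefix, role))
```

You would run the generated statements through whatever SQL client you already use against the cluster. The files left in S3 are plain gzipped CSV, which is the part another platform can import.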
Can you explain what you mean by that? I fail to see how a PostgreSQL query interface could possibly qualify as a perfect example of vendor lock-in.
If you want to move to another DW platform, it's probably not going to be Postgres-based. As every vendor has a slightly different flavor of SQL with different behaviors, this will require redesigning your queries, schemas, and most if not all of your stored procedures. Depending on the company and age of the platform, this could be many thousands of hours of work.

Really, vendor lock-in is pretty much a given with data warehousing platforms. Though these days, it's not uncommon for large companies to have multiple DW platforms all pulling data from each other. When one platform falls out of favor, the users just migrate themselves to another since most reporting systems not made by SAP or Oracle are compatible with pretty much everything.

In contrast, Vertica, Greenplum, Netezza, Teradata Aster, and CitusDB are all based on PostgreSQL forks. In many cases, the client libraries behave like psql, and ease conversions at that level.

As to SQL language differences, no DW platform uses "standard SQL", just as no two RDBMS use the exact same SQL dialect.

I dare say database platform lock-in is a universal issue. Any migration will involve effort.

To be fair, that kind of comes with the territory when talking about data warehousing. The data volumes are so large that migrating them is usually out of the question, and query languages vary between vendors pretty significantly.
"Resort" is probably a better analogy than "prison." Most people wouldn't choose to leave, since the accommodations are so nice, but for the expense.