Hacker News new | ask | show | jobs
by bsg75 5133 days ago
I am genuinely interested in why schema changes in RDBMS' are such a point of contention. Is it the addition of new columns that are problematic, or the modification of existing column datatypes?

Having written more than one mechanism to automate schema upgrades in remote deployments, and numerous migration scripts, I know it is an extra step in the development process, but I have not considered it an onerous one.

3 comments

Try ALTER TABLE ADD COLUMN on a 100 million rows table :).

Jeremy Zawodny from Craigslist explained why it helped them a lot for Craigslist archives database, where an alter table could take up to 24 hours: http://www.10gen.com/presentations/mongosf2011/craigslist

OK, a MySQL problem. Not so much with PostgreSQL, where adding new columns noes not need to touch every table row.

NoSQL ~= NoMySQL

For me, the lack of schema also means that during an upgrade on a large database I can upgrade my app, start batch updating the data in the database and then also perform on-access upgrades and minimize my app downtime. Changing a schema in a large database can potentially require a lot of downtime or other interesting gymnastics to keep downtime to a minimum. I'm not completely sold on Mongo but this is one area where not having a schema can really help.
Would not any online schema update in a more complicated cases (not just simply adding or removing columns) require interesting gymnastics or downtime in MongoDB too? The application needs to support both the old and new (and possible an intermediate) schemas during the online upgrade.

PostgreSQL supports transactional schema modifications. fast adding and removing of columns, and lockless index creation. So in simple cases upgrading the schema is trivial. For complicated cases it can be a mess, but my guess is that that applies to any database.

Schema changes are exclusively an application-layer change in MongoDB. So MongoDB can support as many schemas as you want at the same time whilst you are migrating.

Also don't forget that MongoDB can have arrays and sets as a "column" type. Which if you tried to replicate in RDBMS would mean a multi-table migration.

> Schema changes are exclusively an application-layer change in MongoDB.

Which is not by itself an advantage. You still need to write the code which does the schema change if you for example rename a field.

> Also don't forget that MongoDB can have arrays and sets as a "column" type. Which if you tried to replicate in RDBMS would mean a multi-table migration.

So can PostgreSQL. The sets are not as general as in MongoDB though since they can only store strings.

Simple. To add a new attribute with MongoDB I simply add a new variable in my Java model and that's it. Done.

With RDBMS. I have to write an update SQL script, rollback SQL script, run it against my dev environment, ensure it is applied through test environments and hope the DBA doesn't forgot to run it in production (it happens).

I use an RDBMS with Django. When I want to add a new attribute to a schema, I just add the variable to my model, and South picks up on the change and writes the update script for me.