Hacker News
by mapleoin 5237 days ago
Does anyone have a link to a decent comparison between MySQL and PostgreSQL? I'm really wondering why so many people use MySQL, even though it supports far fewer SQL features than PostgreSQL.
7 comments

EnterpriseDB (who sell a commercialized version of Postgres) have a couple of MySQL vs Postgres white papers on their site, but they are hidden behind a registration wall: http://www.enterprisedb.com/resources-community/whitepapers-...

Robert Haas, a Postgres committer, occasionally blogs about comparisons: http://rhaas.blogspot.com/search/label/mysql

There's also http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL

Historically, MySQL has been more widely available on low-end web hosting plans, so it's what a lot of people first use when they start using databases, and a lot of web apps, such as WordPress, support it exclusively.

Until a year or two ago, only MySQL had built-in (if occasionally fragile) replication, which made it popular for that reason alone. Postgres now has robust replication, with new features coming down the pipeline soon: http://www.depesz.com/2011/07/26/waiting-for-9-2-cascading-s....

I prefer Postgres, but oddly enough, under Oracle there have been some interesting features added to MySQL, which is good for both.

Postgres is also not (yet, hopefully?) available in Amazon's RDS, probably again due to the replication topic.
Pure inertia. Over time, a lot of people were using the LAMP stack, and they continued to use MySQL for familiarity reasons.

Also, MySQL had a commercial entity behind it in its early days, which promoted it a lot. In addition, it worked on all platforms, including Windows, while Postgres has only been there for the last couple of years.

I posted this elsewhere in the thread, but I ran some benchmarks on schema changes between the two on 5 million row tables (when deciding to switch) and PostgreSQL completely clobbered MySQL performance-wise: https://gist.github.com/1620133. As we recently switched to PostgreSQL, one gotcha we found is that PostgreSQL uses a separate process instead of a thread for each connection, so establishing new connections is slower and more memory-intensive. That makes connection pooling, whether held application-side or in a pooler such as PgBouncer, incredibly important.
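
To make the pooling point concrete, here is a minimal sketch of an application-side pool. It is not how any particular driver or PgBouncer implements pooling; `ConnectionPool` and `FakeConnection` are hypothetical names, and `FakeConnection` only stands in for a real driver connection so the example runs without a database.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Keep a fixed set of open connections and hand them out on demand."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable that opens a connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks while all are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back instead of closing it


class FakeConnection:
    """Hypothetical stand-in for a real driver connection."""

    opened = 0  # counts how many connections were ever established

    def __init__(self):
        FakeConnection.opened += 1


pool = ConnectionPool(FakeConnection, size=2)
for _ in range(100):
    with pool.connection() as conn:
        pass  # run queries on `conn` here

print(FakeConnection.opened)  # 2: one hundred checkouts, two connections
```

The expensive step (with PostgreSQL, starting a backend process) happens `size` times at startup rather than once per request.
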
Simple reliable replication has been a huge differentiator for a long time; enough so to put up with a lot of the other faults of MySQL. I have not revisited Postgres replication in a long time, but I have seen that it has been worked on. Would anyone with recent experience in both care to explain how the replication of the two stacks up in recent versions?
> Simple reliable replication

I cringe every time I read that. MySQL replication is many things, but it is not reliable (as anyone who has used it at scale will confirm).

I think the only reason this myth prevails is that hardly anyone ever actually verifies that their master and slave are in sync. A table checksum can be a real eye-opener here, especially on a deployment that's been running for a while and has undergone schema changes, restarts, network splits, etc.
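
A rough sketch of what such a check looks like, using two in-memory SQLite databases as stand-ins for a master and a slave so that it runs anywhere. The `table_checksum` helper and the `users` table are made up for illustration; on a real MySQL deployment you would more likely start from the built-in `CHECKSUM TABLE` statement than hash rows in application code.

```python
import hashlib
import sqlite3


def table_checksum(conn, table, key):
    """Hash every row of `table`, ordered by `key`, into one hex digest."""
    digest = hashlib.sha256()
    # Table and column names are interpolated directly: trusted input only.
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY {key}"):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def make_db(rows):
    """Build an in-memory database holding the given (id, name) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    return conn


master = make_db([(1, "ann"), (2, "bob"), (3, "cat")])
replica = make_db([(1, "ann"), (2, "bob"), (3, "cot")])  # one drifted row

in_sync = (table_checksum(master, "users", "id")
           == table_checksum(replica, "users", "id"))
print("in sync" if in_sync else "replica has diverged")
```

Row counts match on both sides here, which is why a count alone would not catch the drift.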

<rant>Simple, but not reliable. I've seen admins enable statement-based replication without understanding it, and trash the db. Which is generally my gripe with MySQL: it has some popular features that only work if you don't look at them too closely; starting with support for the SQL standard.</rant>
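
For anyone who hasn't been bitten yet, here is a toy model of why replaying statements can diverge where copying rows does not. It uses SQLite and its `random()` function purely as a stand-in; the `tokens` table is invented, and real MySQL logs extra context for some functions, so which statements are actually unsafe depends on the server and version.

```python
import sqlite3


def new_db():
    """Create an empty in-memory database with the example table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tokens (id INTEGER PRIMARY KEY, value INTEGER)")
    return conn


def contents(conn):
    return conn.execute("SELECT id, value FROM tokens ORDER BY id").fetchall()


master = new_db()
statement_replica = new_db()
row_replica = new_db()

# The result of this statement depends on where and when it runs.
stmt = "INSERT INTO tokens (id, value) VALUES (1, random())"
master.execute(stmt)

# Statement-based: ship the SQL text and execute it again on the replica.
statement_replica.execute(stmt)

# Row-based: ship the rows the master actually wrote.
rows = master.execute("SELECT id, value FROM tokens").fetchall()
row_replica.executemany("INSERT INTO tokens VALUES (?, ?)", rows)

print("statement-based matches:", contents(master) == contents(statement_replica))
print("row-based matches:", contents(master) == contents(row_replica))
```

Both replicas end up with one row and no error is raised anywhere, which is exactly why this kind of drift goes unnoticed.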

PostgreSQL's built-in replication is pretty easy to set up[1] and provides a writable master, and a cascade of slaves. Slaves can be synchronous or asynchronous, and the synchronicity can be turned off per transaction.

[1] http://www.depesz.com/2011/01/24/waiting-for-9-1-pg_baseback...
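
As a sketch of the per-transaction switch: PostgreSQL exposes this through the `synchronous_commit` setting, which `SET LOCAL` changes for the current transaction only. The `transaction_statements` helper below is hypothetical and only builds the SQL so the example runs without a server. Note that `OFF` also skips the wait for the local WAL flush, so it trades a little durability for latency.

```python
def transaction_statements(statements, synchronous=True):
    """Wrap SQL statements in a transaction, optionally committing asynchronously."""
    script = ["BEGIN"]
    if not synchronous:
        # SET LOCAL reverts automatically when the transaction ends.
        script.append("SET LOCAL synchronous_commit TO OFF")
    script.extend(statements)
    script.append("COMMIT")
    return script


# A low-value write (e.g. a page-view counter) that need not wait for a standby.
for line in transaction_statements(
    ["UPDATE page_views SET hits = hits + 1 WHERE page_id = 42"],
    synchronous=False,
):
    print(line + ";")
```

The `page_views` table is made up; the point is only that important transactions keep the default while cheap ones opt out.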

In addition to the other factors listed here, Postgres's default configuration was tuned for a dramatically underpowered machine for many years. Yes, that meant that the occasional user who did have that kind of machine saw acceptable performance out of the box, but the other 9/10 systems burned a lot of the DBA's time tuning the system. I think that's a big reason why Postgres has a reputation for difficulty in some quarters.
For a long time there was no PostgreSQL for Windows, so Windows developers had to use MySQL.

MySQL is good enough, so there is no need to migrate applications for a few additional features.

MySQL was very, very easy to get started with, and you got up to cruising speed quickly. PostgreSQL had more of a learning curve.

In the Windows world the same is true of SQL Server -- the setup, connectivity, and basic usage are so incredibly easy that they made it the first choice of many teams.

This seems incredible -- that products are chosen on such an irrelevant-in-the-long-term basis -- however it has proven true across almost all of the computing market, even targeting highly skilled developers. PHP has few competitive merits, yet it was the default option for many because it was so easy to make something basic in.

There's a lesson in that.

> This seems incredible -- that products are chosen on such an irrelevant-in-the-long-term basis

I don't think there is anything too incredible in that. If you want to throw together an idea quickly, get it out there and test the response, then use whatever technology gets the job done quickest. You can always change later.

Why waste huge amounts of time setting up a technically perfect database for a product it turns out no-one wants?

The difference between getting competent with PostgreSQL versus MySQL was just a few hours. On the scale of a project such a difference dissolves into complete irrelevance, yet it was enough to sway many to use MySQL when it was severely deficient by comparison (though with its adoption it saw love that brought it up to, if not beyond, parity).

The same is true of many technologies and approaches: projects that consume thousands or tens of thousands of hours are built on a toolset chosen because it saved single-digit hours at the outset.

Changing in the future is seldom as easy as it seems in those early days.

> The difference between getting competent with PostgreSQL versus MySQL was just a few hours.

Yes and no. There are vastly more hosting options that provide MySQL than Postgres. So it's not an issue of getting competent with the DB system; it's an issue of using an existing LAMP stack or having to roll your own.