Hacker News new | ask | show | jobs
by pilif 5412 days ago
When I started with PHP3 development in 1998, PostgreSQL was still called Postgres95 and it had some severe limitations which were directly adverse to the features needed for web development:

1) performance was very slow

2) There was an effective row size limit of 8K over the whole row which worked against people who wrote forum software (big posts) or were (ugh) storing binaries in the database.

3) Even today, PostgreSQL requires a bit more maintenance and configuration until it works right, whereas MySQL ran out of the box. Back then, VACUUM wasn't at all optional and still required an exclusive lock over the tables it was working on.

4) There was a misconception that each Postgres user needed to be a Unix system user (this might or might not be true. Probably isn't, but I and others were certainly thinking that), so it was working against the usual cheap virtual hosting.

5) The MySQL extension for PHP was much more advanced than the postgres one, leading to people preferring MySQL which in the end lead to the virtual hosts only installing MySQL at which point Postgres' fate was sealed.

6) MySQL was coined as a spiritual successor to mSQL (just look at the names), which was very popular at the time as it was the first SQL-database for "normal people". You either did flat-files or mSQL.

7) MySQL had Windows support from the beginning. Postgres95 was still using cygwin (if you could get it to work). Back then, many people were using Windows as their development environment and being able to run the whole stack on your development machine does have some advantages.

By now, aside of the maintenance thing, Postgres has lost all these drawbacks, but now it's too late as people are using what they know is working for them, so they are just going with MySQL (or SQLite if they need more simplicity, which is interesting in itself as MySQL for ages provided an embedded linkable and serverless variant which also never took on).

2 comments

#2 - even if you don't store binaries, just very long posts, it's a problem.

#4 - whether it's a misconception or not, the majority of tutorials and even the default utilities make that assumption. Working around that is not something as widely documented as the defaults.

Also, forum software is a big arena. Not being able to get accurate count() values back hurt(s) postgresql. It's hurting people who now rely on MySQL's innodb as well, but you always have the option of MyISAM for certain tables that you want accurate count()s on for pagination. I've heard for years that "it's so easy to do this in PostgreSQL, just write some triggers and stored procedures... " yet... if it's "so easy", why not just bundle that in as default functionality (or perhaps even just a script that would create triggers for you) in postgresql?

I don't ever recall hearing about the "embedded linkable and serverless variant" of MySQL, and have been doing PHP since 1996. Perhaps it was a licensing issue? MySQL seemed to take off largely due to a rise in PHP, but sqlite seems to have taken off more because of embeddability and the public domain aspect of it first, well, before I ever saw PHP adoption of sqlite.

> I don't ever recall hearing about the "embedded linkable and serverless variant" of MySQL, and have been doing PHP since 1996. Perhaps it was a licensing issue?

http://dev.mysql.com/doc/refman/5.1/en/libmysqld.html

pretty much a licensing issue as all of MySQL is released under the GPL or a proprietary license. This also includes the mysql client by the way, but they made a license exception there that allowed linking against PHP (extensions mysql and mysqli) and later there was a reimplementation of the on-the-wire protocol inside a PHP extension (mysqlnd) released under the PHP license.

> Not being able to get accurate count() values back hurt(s) postgresql.

count() is totally accurate within the limits of MVCC. If you need it to be 100% accurate across statements, make your transaction SERIALIZABLE.

There's one thing about count(): count(*) under MySQL, if using MyISAM tables, is optimized so it doesn't have to actually count and thus is much faster than, say count(row), count(whatever) with a where clause or count([asterisk as to not confuse the HN parser.]) in any other database.

But the moment you use InnoDB or any other database that supports transactions, count(whatever) unfortunately requires counting in all cases.

re: count() - I guess I should have also said "reasonably fast". Waiting 8 seconds for a count() isn't practical for many apps.
> why not just bundle that in as default functionality (or perhaps even just a script that would create triggers for you) in postgresql?

Multi-version concurrency control means that a single universal definition of count() is basically insensible.

Under what circumstances is count() inaccurate in Postgres?
See earlier comment - I misstated my point - it's not so much that it's inaccurate as slow. Dog slow. InnoDB in MySQL 'fixes' the speed with estimates, but that's generally not very useful either in most cases (sometimes, it is).

I was confusing my earlier innodb count() experiences (inaccurate estimates) with postgresql count() experiences (dog slow).

> 2) There was an effective row size limit of 8K over the whole row which worked against people who wrote forum software and were (ugh) storing binaries in the database.

Could you not just use the dedicated TEXT or BLOB type for that kind of stuff, and store them out-of-row? (yes, would probably have been even slower)

Nope. There was blob-support, but that relied on special library functions to extract the data to a file which you'd then have to manually read.

Of course this also means that you couldn't search in these fields or do anything else you'd do directly on the database.

TOAST tables were added in 7.1 AFAIK which lifted that limit and allowed for arbitrary sized rows by moving the big fields away into their own storage.

The old functionality is still there (http://www.postgresql.org/docs/9.0/interactive/largeobjects....) though I totally failed and still fail to see any practical use.

> Nope. There was blob-support, but that relied on special library functions to extract the data to a file which you'd then have to manually read.

OK.

> The old functionality is still there (http://www.postgresql.org/docs/9.0/interactive/largeobjects....) though I totally failed and still fail to see any practical use.

Memory constraints maybe? That's about the only justification I could see for using that over bytea: if you store single binary objects in the GB+ range, you may want to interact with them as streams.