Hacker News new | ask | show | jobs
by rwultsch 3552 days ago
The PG storage engine is not particularly awesome. It is basically COW (with exceptions) and compaction (called vacuum) has been quite painful for a long time. Every release it is supposedly fixed, but people keep complaining. This not to say PG sucks, their optimizer knows far more about their data than InnoDB and PG can perform far more types of execution plans.

We (Pinterest, I wrote most of the MySQL automation) make heavy use of MySQL replication which is vastly simpler to manage than PG. All queries still flow through SQL and unlike PG, we can force whatever execution plan we need. We do lots of PK lookups, and InnoDB is really good at that. In InnoDB all the data is stored in the PK while in PG it is just a pointer.

2 comments

>>In InnoDB all the data is stored in the PK while in PG it is just a pointer.

This is just a consequence of the PK being a clustered index in InnoDB which has both pros and cons. One of the big cons is that all of the columns of the PK are implicitly added to every secondary index as the row identifier. That isn't a big problem if your PK is a single column int, but if it's multiple columns, that often results in unnecessary bloat in your secondary indexes. Ideally (as in, dare I say, MS SQL Server), you'd have the option of a clustered or non-clustered PK for your table so you could choose the optimal index structure for your workload on a per-table basis.

If you don't want a clustered index in InnoDB you can define the primary key as an auto incrementing uint.
Yes, you can, but it doesn't change the fact that you still have a clustered index (an index organized table), which is great for PK lookups, but bad if you do a secondary indexes lookup (because you need to lookup through 2 B-trees instead of 1). There is real, and well-known, tradeoff here.
You missed the point of my post. You are going to have one of the two issues, either looking through two index or indexes including the a large PK. At least with InnoDB you can make the choice. The strategy I suggested gets you the desired outcome of not including a large PK in all secondary indexes.
> The strategy I suggested gets you the desired outcome of not including a large PK in all secondary indexes.

For an application in which most queries need a secondary index lookup, using heap organized tables is more efficient because the database needs to traverse only one B-tree (for the secondary index) that gives the physical position of the row in the heap. When using index organized tables, the database needs to traverse 2 B-trees (the secondary index first, then the primary index). Making the primary key short by using an auto incrementing integer helps, but doesn't remove this overhead.

The other part of the tradeoff is that inserts and many other write operations are less expensive in heap tables. A Big Table in InnoDB, measured in "when do I start having to spend a lot of time troubleshooting this table's performance" is about 1% the size of a Big Table in Postgres. TokuDB was introduced for MySQL for a reason.

Heap vs. Index organization is a classic tradeoff of database design.

Now, if you're saying "it would be really nice if Postgres had the option of index-organized tables" I'd agree with you. I'd love to have that, as an option.

> All queries still flow through SQL and unlike PG, we can force whatever execution plan we need. We do lots of PK lookups, and InnoDB is really good at that.

Being able to force the execution plan is more useful in MySQL than PostgreSQL because MySQL's optimizer is not very good at planning queries.

If you do a lot of PK lookups, then you don't need to force the execution plan.