Hacker News new | ask | show | jobs
by dhd415 3554 days ago
>>In InnoDB all the data is stored in the PK while in PG it is just a pointer.

This is just a consequence of the PK being a clustered index in InnoDB which has both pros and cons. One of the big cons is that all of the columns of the PK are implicitly added to every secondary index as the row identifier. That isn't a big problem if your PK is a single column int, but if it's multiple columns, that often results in unnecessary bloat in your secondary indexes. Ideally (as in, dare I say, MS SQL Server), you'd have the option of a clustered or non-clustered PK for your table so you could choose the optimal index structure for your workload on a per-table basis.

1 comments

If you don't want a clustered index in InnoDB you can define the primary key as an auto incrementing uint.
Yes, you can, but it doesn't change the fact that you still have a clustered index (an index organized table), which is great for PK lookups, but bad if you do a secondary indexes lookup (because you need to lookup through 2 B-trees instead of 1). There is real, and well-known, tradeoff here.
You missed the point of my post. You are going to have one of the two issues, either looking through two index or indexes including the a large PK. At least with InnoDB you can make the choice. The strategy I suggested gets you the desired outcome of not including a large PK in all secondary indexes.
> The strategy I suggested gets you the desired outcome of not including a large PK in all secondary indexes.

For an application in which most queries need a secondary index lookup, using heap organized tables is more efficient because the database needs to traverse only one B-tree (for the secondary index) that gives the physical position of the row in the heap. When using index organized tables, the database needs to traverse 2 B-trees (the secondary index first, then the primary index). Making the primary key short by using an auto incrementing integer helps, but doesn't remove this overhead.

The other part of the tradeoff is that inserts and many other write operations are less expensive in heap tables. A Big Table in InnoDB, measured in "when do I start having to spend a lot of time troubleshooting this table's performance" is about 1% the size of a Big Table in Postgres. TokuDB was introduced for MySQL for a reason.

Heap vs. Index organization is a classic tradeoff of database design.

Now, if you're saying "it would be really nice if Postgres had the option of index-organized tables" I'd agree with you. I'd love to have that, as an option.