Hacker News new | ask | show | jobs
by asah 2953 days ago
Great to see MySQL adding this stuff! There's still a ton of reasons to choose Postgres and more and more silicon valley startups seem to be choosing pg - I don't remember the last time I met a startup choosing MySQL.

DBMSs are giant complex pieces of software with a million features - it's really hard to compare them. But if I had to sum it up, you can dump freaking line noise into Postgres and then hide the nastiness and manage and query it like a well-designed database. If you have a pretty database, good for you, but today's app writers have gotten lazy with NoSQL record stores and their databases resemble vomitoriums - and let's not talk about what people used to do in the 80s and early 90s.

Without further ado, Postgres vomitorium cleanup features:

- user defined functions/aggegrates/windowfuncs in your favorite language incl JavaScript, which means you can write tricky business logic once and run it where the data is, vs pulling out millions of records from the database. Language list: https://www.postgresql.org/docs/10/static/external-pl.html

- foreign data wrappers with hundreds of connectors AND a 5 minute toolkit for authoring new wrappers in python and other scripting languages. https://wiki.postgresql.org/wiki/Foreign_data_wrappers http://multicorn.org/

- index goddamned anything. Postgres has the most array of index types of any open source database AND if you need, you can easily write a function (in javascript or python!!!) and create an index that's the result of that function call. Postgres even has a full range of partial indices and block range indices, which make it practical to index massive and sparse datasets. https://www.postgresql.org/docs/current/static/indexes-parti... https://www.postgresql.org/docs/current/static/indexes-expre... https://www.postgresql.org/docs/current/static/brin-intro.ht...

EXAMPLE: I was once handed a MySQL database of IoT signals where timestamps were in seconds since the epoch and asked to report on this data without changing the database. No biggie: 15 minutes to replicate this OLTP database to a read-only Postgres replica (dumb schema mapping) - then a Postgres VIEW to hide this nastiness and a function index on the timestamp column (to_timestamp).

(from memory) CREATE VIEW foo AS SELECT *, to_timestamp(mysql_ts_col) as ts_timestamp FROM replicated_mysql_table; CREATE INDEX foo_ts_inx ON replicated_mysql_table(to_timestamp(mysql_ts_col));

At another company, we got dumped a load of JSON and weren't sure how we'd need to parse it. No biggie, I just created indices using function calls that parsed the JSON.

- tons of native datatypes and extensible datatypes. In cases where you're handed complex structures or "weird" data that doesn't behave like most programmers expect, you can define new datatypes, then create a library of user defined functions around them. https://www.postgresql.org/docs/9.5/static/xtypes.html

- sampling. Postgres has native, low-level support for queries that sample the data, which makes it super fast to explore data while preserving some semblance of statistics. https://blog.2ndquadrant.com/tablesample-in-postgresql-9-5-2...

- EXPLAIN. The Postgres planner/optimizer is still the king at explaining why your query is taking forever and what you can do about it. Admittedly, this stuff quickly gets arcane, but you can post your EXPLAIN output to a forum and guys like me will tell you how to override the JOIN order, update statistics, etc. https://www.postgresql.org/docs/current/static/using-explain...

Finally, no conversation about Postgres vs <x> is complete without mentioning that ALMOST EVERY FEATURE IN POSTGRES WORKS WITH EVERY OTHER FEATURE, which means you don't waste hours investing in something and then "oops" your carefully written user defined function can't be invoked in some obscure place - with Postgres, you can assume everything just works and will keep working. There's few pieces of software that can claim this.

2 comments

> EXAMPLE: I was once handed a MySQL database of IoT signals where timestamps were in seconds since the epoch and asked to report on this data without changing the database

I'm not following why you moved the data into postgres other than to say you did? Are you suggesting that because you were restricted from making schema changes to the MySQL instance that that's a reason why postgres is superior?

The reason for the migration was to be able to leverage the Functional/Expression Index[1] capability of Postgres. Any type of aggregated time series reporting queries were likely painful from a performance perspective, since an index on the epoch timestamp would have limited usefulness from an optimization standpoint.

So instead, he leveraged the expression index functionality of Postgres to pre-materialize an index against the converted timestamp. He didn't touch the table structure itself so it's transparent, but gets the performance benefits of that index already existing.

MySQL doesn't support function based indexes directly, although you can achieve a similar result in newer versions of MySQL with an intermediate step. You can create a Generated Column[2] first, and then build an index against that. If you specify it as a virtual generated column, then it's essentially the same as the above process where the column isn't physically stored, but you can index it. That said, asah may still not have been able to do that if the version of MySQL was too old or even that level of schema change was not allowed.

[1] https://www.postgresql.org/docs/current/static/indexes-expre...

[2] https://dev.mysql.com/doc/refman/5.7/en/create-table-generat...

I read it as he had no way to alter mysql, and he needed to work with time functions, which is tricky without a real timestamp instead of an integer, and is impossible without correct indexes.

To me it seems like, in that case, he chose postgres since he could sync the data and use a custom schema over it, with proper types and indexes.

> their databases resemble vomitoriums

Curious what you mean by this. Vomitoriums had nothing to do with vomit, except in an etymological sense.