Great to see MySQL adding this stuff! There's still a ton of reasons to choose Postgres and more and more silicon valley startups seem to be choosing pg - I don't remember the last time I met a startup choosing MySQL.
DBMSs are giant complex pieces of software with a million features - it's really hard to compare them. But if I had to sum it up, you can dump freaking line noise into Postgres and then hide the nastiness and manage and query it like a well-designed database. If you have a pretty database, good for you, but today's app writers have gotten lazy with NoSQL record stores and their databases resemble vomitoriums - and let's not talk about what people used to do in the 80s and early 90s.
Without further ado, Postgres vomitorium cleanup features:
- user defined functions/aggegrates/windowfuncs in your favorite language incl JavaScript, which means you can write tricky business logic once and run it where the data is, vs pulling out millions of records from the database. Language list: https://www.postgresql.org/docs/10/static/external-pl.html
EXAMPLE: I was once handed a MySQL database of IoT signals where timestamps were in seconds since the epoch and asked to report on this data without changing the database. No biggie: 15 minutes to replicate this OLTP database to a read-only Postgres replica (dumb schema mapping) - then a Postgres VIEW to hide this nastiness and a function index on the timestamp column (to_timestamp).
(from memory)
CREATE VIEW foo AS SELECT *, to_timestamp(mysql_ts_col) as ts_timestamp FROM replicated_mysql_table;
CREATE INDEX foo_ts_inx ON replicated_mysql_table(to_timestamp(mysql_ts_col));
At another company, we got dumped a load of JSON and weren't sure how we'd need to parse it. No biggie, I just created indices using function calls that parsed the JSON.
- tons of native datatypes and extensible datatypes. In cases where you're handed complex structures or "weird" data that doesn't behave like most programmers expect, you can define new datatypes, then create a library of user defined functions around them. https://www.postgresql.org/docs/9.5/static/xtypes.html
- EXPLAIN. The Postgres planner/optimizer is still the king at explaining why your query is taking forever and what you can do about it. Admittedly, this stuff quickly gets arcane, but you can post your EXPLAIN output to a forum and guys like me will tell you how to override the JOIN order, update statistics, etc. https://www.postgresql.org/docs/current/static/using-explain...
Finally, no conversation about Postgres vs <x> is complete without mentioning that ALMOST EVERY FEATURE IN POSTGRES WORKS WITH EVERY OTHER FEATURE, which means you don't waste hours investing in something and then "oops" your carefully written user defined function can't be invoked in some obscure place - with Postgres, you can assume everything just works and will keep working. There's few pieces of software that can claim this.
> EXAMPLE: I was once handed a MySQL database of IoT signals where timestamps were in seconds since the epoch and asked to report on this data without changing the database
I'm not following why you moved the data into postgres other than to say you did? Are you suggesting that because you were restricted from making schema changes to the MySQL instance that that's a reason why postgres is superior?
The reason for the migration was to be able to leverage the Functional/Expression Index[1] capability of Postgres. Any type of aggregated time series reporting queries were likely painful from a performance perspective, since an index on the epoch timestamp would have limited usefulness from an optimization standpoint.
So instead, he leveraged the expression index functionality of Postgres to pre-materialize an index against the converted timestamp. He didn't touch the table structure itself so it's transparent, but gets the performance benefits of that index already existing.
MySQL doesn't support function based indexes directly, although you can achieve a similar result in newer versions of MySQL with an intermediate step. You can create a Generated Column[2] first, and then build an index against that. If you specify it as a virtual generated column, then it's essentially the same as the above process where the column isn't physically stored, but you can index it. That said, asah may still not have been able to do that if the version of MySQL was too old or even that level of schema change was not allowed.
I read it as he had no way to alter mysql, and he needed to work with time functions, which is tricky without a real timestamp instead of an integer, and is impossible without correct indexes.
To me it seems like, in that case, he chose postgres since he could sync the data and use a custom schema over it, with proper types and indexes.
Interesting but obviously biased in favour of MySQL.
I disagree with some Postgres (which I know v well not MySQL though) comments. Wrong everyone does not use Sequential Integer Primary Keys, because they are completely useless as a Set constraint, I use real world logic. Update is irrelevant to me, so Vacuum not a problem, I use append only ie immutable database, the upside (not mentioned) is that this architecture makes PostgreSQL rock solid, can stay up for years unlike for example MS SQL Server that requires regular reboots and often fails to come backup cleanly. Also provide DDL transactions with rollback.
I’m curious does MySQL now have Check Constraints, Arrays, Drop Schema with cascade (a huge timesaver), V8 PL, Lateral Joins, Range Types, Custom Aggregates (so powerful) SQL Function inlining, Parallel Query, Functional Indexes, Exclusion Constraints (eg no period overlap), Notify/Listen, Foreign Data Wrappers?
The connection comments are a bit dubious. MySQL will use less memory for 1000 connections but performance will still drop due to contention and context switching. In both systems you want a small number of connections to the actual database, something on the order of 1-2x cpu cores usually, and something on top pooling client connections if you need a lot of them, pgbouncer or the equivalent for MySQL.
I had some real issues with mysql handling more than 2000 connections. Once past that limit, cpu usage increases exponentially with few connections due to context switching. At 3000 connections, 80% of thr cpu was used gor context switching and it got unusable. That was a 64 core/256gb ram server.
I ended up reducing the number of clients. In my case I had a thousand servers, and I was able to change the application structure and merge them into a dozen big application servers. Now with connection pooling each server has less than 100 connections.
As a more permanent solution for scaling, I'm moving out of mysql into something more distributed.
The article contains a footnote about UUIDs as primary keys.
> UUID as a primary key is a terrible idea, by the way — cryptographic randomness is utterly designed to kill locality of reference, hence the performance penalty
Is there anyone who can go a little bit more in detail?
We planned to migrate our database to use UUIDs as primary keys. This will allow creating new rows on clients knowing the new primary key before sending them to the server (simplifying client and server code).
(The author is assuming that the primary key controls disk layout, which is usually true.) One advantage of using an incrementing integer is that rows will be ordered on disk based on when they were created. This often helps performance. If a query asks for 25 consecutive rows, there is a good chance they will all be on the same page. If you use UUIDs, then they could be on 25 different pages and you will have to do 25x the disk IO to handle the query.
> One advantage of using an incrementing integer is that rows will be ordered on disk based on when they were created.
Well, kind of. A lot of people think the auto incrementing integer function in many RDBMSs will always increase, or will never have gaps. It's likely but not guaranteed that n+k was created after n. If you really need to store the creation date, then you should store that in a datetime/timestamp column.
> If a query asks for 25 consecutive rows, there is a good chance they will all be on the same page. If you use UUIDs, then they could be on 25 different pages and you will have to do 25x the disk IO to handle the query.
This is true, but it also means that if you need to write 25 different rows, it will be in 25 different pages. That sounds bad because non-sequential writes are slower, but you have to remember that it could be 25 different connections trying to write! In other words, you create a hot spot with sequential inserts. If that's the end of the table, you'll have threads constantly waiting for other processes to do inserts since inserts lock the page being inserted.
So, yes, clustering on a UUID can cause problems (fragmented indexes, inefficient reads), but clustering on an autoincrement can also cause issues depending on your work load.
In reality, what you need to do (in the general case) is cluster on your business key even if it's not the primary key for your table.
> It's likely but not guaranteed that n+k was created after n.
This is true in mysql if you rollback a transaction, or use a INSERT INTO ... ON DUPLICATE KEY UPDATE.
In the first, the rollback doesn't revert the sequence, in the second the "insert part" will always increase the number, even if there is a duplicate to update.
My point is that nothing stops you from modifying the value of an auto increment column, nor from inserting directly with a specific value. Yes, rollbacks don't roll back consumed values, but an auto increment column isn't immutable and the table isn't required to use the next value.
I've seen an application do things like an INSERT ROLLBACK SELECT LAST_INSERT_ID() to "reserve" IDs... or even perfectly acceptable things like reserving IDs out of a SEQUENCE. Those weren't all MySQL systems, but it did lead to confusion sometimes why gaps might appear or why timestamps might be "inconsistent".
The above one was a potential problem through 5.7 though, as it was possible to reuse some values since MySQL kept the auto increment value in memory only. INSERT followed by a ROLLBACK, then restart the server and you could get reused IDs. It's rare, but I've seen it. However, but it looks like they changed it with 8.0 to save the auto increment value to a system table now. That's a good thing.
> One advantage of using an incrementing integer is that rows will be ordered on disk based on when they were created
If each identifier starts with a logical time, say lamport timestamp, you can still get the same ordering effect without incrementing integers in a centralized place somewhere.
I've written my own UUID generation function which uses the current timestamp with microsecond precision for the first 64 bits and a random value for the last 64 bits. So far it's been a great success.
Collisions are extremely unlikely unless you have Google scale and generated UUIDs are mostly ordered.
I've also done this at a medium ish scale. We had to be careful with how the uuids were generated, though. Specifically which portion of the bytes contained the timestamp (and in what order) since different databases store uuids differently.
> Is there anyone who can go a little bit more in detail?
* UUIDs are way more painful than serials to recognise, remember, input or transmit especially if you're not dealing with huge tables. "18574" is easy to read/grok, "21caeffa-0fca-4f4e-b845-46ef0576e42a" is not.
* UUID are 128 bit instead of 32 for most serial PKs by default, this may or may not matter. Note that this doesn't just impact the table itself (lowering data locality and thus performanes: less stuff fits into caches) but also any FK as well as wire transmission (where the size explodes as you're going to transmit a hex version of the UUID so always at least 32 bytes, a decimalised u32 maxes out at 10 bytes).
* because UUID are random data they're intrinsically non-ordered (as opposed to serial ids which… are), this means your writes are all over the place and clustering is defeated, this adversely impacts your reads and writes in SQL dbs (some other DB techs especially distributed ones prefer the scattering: https://news.ycombinator.com/item?id=14524174)
The latter can he mitigated by using "ordered UUIDs": you can generate UUID1 (nominally time-based) such that the final value has a sequential "head" and a random "tail", either by taking control over the process or by generating a regular UUID1 and rejiggering it a bit: https://stackoverflow.com/questions/412341/how-should-i-stor...
> UUIDs are way more painful than serials to recognise, remember, input or transmit...
Completely agree, but I've found this can be a non-technical "feature" too. Serial integer primary keys are much more susceptible to human error when doing any sort of direct database manipulation.
Make a typo on a integer PK? Wrong user gets deleted. UUID typo? Row not found (almost certainly).
Another source of error I've seen is when someone in sales asks "Hey, can you remove User #1234?", but they really meant Customer #1234." With UUIDs, there's no "collision" between the tables.
Clearly there are better process/tool-based ways to prevent these types of mistakes, but it's a useful side effect of UUIDs.
You can somewhat mitigate the typo problem with integers by encoding them to the outside with parity included.
An 8 bit parity should be able to easily tell any possible typo in a 32 or 64bit integer and even correct errors. You could even put the parity into the lowest 8 bit of the integer.
I'm currently working on this for a project of mine to not only prevent typos but tolerate them by using the parity and using letters that are far apart on my QWERTZ keyboard (bit local but it should work for most keyboard sets and I have parity to fall back on)
> UUIDs are way more painful than serials to recognise, remember, input or transmit especially if you're not dealing with huge tables. "18574" is easy to read/grok, "21caeffa-0fca-4f4e-b845-46ef0576e42a" is not.
One of the reasons we use them is because theyre not easily recognized or sequential.
> The latter can he mitigated by using "ordered UUIDs": you can generate UUID1 (nominally time-based) such that the final value has a sequential "head" and a random "tail"
Specifically, in PostgreSQL such UUIDs can be generated using uuid_generate_v1mc()
I would suggest to write your own UUID generator. Instagram published their own approach [0], we are using something very similar to it. It seems to originate at Facebook, as the IDs generated from both instagram and FB are very similar. It can be completely automated in Postgres (probably in Mysql too).
On the plus side, because it's timestamp based, you can use the generated IDs in sorting and paging as there is a guarantee that each passing second will yield larger bigint. One caveat. If you are going to use it in Javascript, make sure you send it as string as Javascript only supports 53bit integers (due the fact that all integers in Javascript are floating points).
The one time I used UUIDs as primary keys I quickly regretted it. They're huge, they're not possible to manually enter ("which row ID was causing that bug?") and they make foreign key relationships ugly and large too.
If you want to generate IDs independently of the database you can do so using a "ID generator" mechanism.
Set up three redis instances (or MySQL or anything else that can increment a counter). Have each one increment by three each time. Start then at 0, 1 and 2.
Now you can ask any of those theee instances for a new ID and you'll get one that has not been used before, thanks to them being offset from each other.
I first saw this technique used by Flickr when they switched to a Shaffer database.
I believe one of the sought after features when people want to generate ids outside the database is for offline apps, so there is no way to connect to outside counters. The client should have a way to locally generate an id that is final and will be used for the rest of the entity's life. This simplifies the synchronization with the backend because otherwise there must be a (temporary) client id and a final true id. This leads to very convoluted code that deals with two logically different entities, the "pending" ones and the "synced" ones.
UUID (or GUID on SqlServer) are essentially 128-bit integers. The fact they don't have inherent meaning beyond uniquely identifying a row and establishing referential integrity is a plus. In my opinion the primary key is for the database's needs; a human-friendly number or code for lookup is both optional and trivial to implement so when there's a requirement for an easy reference number or code for a record I use both a GUID for the primary key and a unique (or autonumber) field for human reference.
There are functional benefits for having UUIDs as a primary key, but yes there are performance impacts on writes and ORDER BY. The best way to find out how it will impact your application is to have performance tests in place, and test out the primary key change in a development environment. I do not think you'll able to determine the impact on performance/scalability based on "pure thought".
Why would you want to ORDER BY on a UUID field? Not trolling, I honestly can't think of a reason why you would want to do this. Secondly, aren't UUIDs treated by the database engine as a 128-bit integer? If they are being treated as varchar fields then I can see how this would affect performance negatively but again, I question if this is really the case.
For random (and not timespace prefixed) uuids, you can end up hitting more blocks because if locality of reference, if you are using b+trees. If you are using an lsm index, you get blocks of data written at the same time in the index anyway, so your "slow" disk isn't so bad, because that is in your cache already. For b+trees and random uuids, data in blocks are basically scattered everywhere. So your index lookup of 1 billion items could hit 1 billion leaf blocks, instead of 1 billion / entries per leaf.
Why not generate an UUID field that’s unique, but keep a surrogate integer primary key on the server? You can join on either one on the server, but keep track of the rows using the UUID. Asking as someone with no SQL expertise (relative to the HN crowd).
That's generally the route I've gone in the past, particularly for complex data models. The entity has a uuid field with a unique constraint along with an auto-incrementing int/bigint for a primary key. The UUID is what gets exposed for any external/API usage, but all internal join logic leverages the int primary key.
That way you mitigate the potential performance impact of doing complex joins on UUID fields. While also gaining a bit of flexibility in any future change management process by decoupling your internal database ids from any publicly exposed id. So instead of e.g. having to coordinate with external applications to ensure things don't break when you switch your id from an int to a bigint, you keep the uuid consistent and internal database optimizations and changes stay transparent to stakeholders of the database.
I use UUIDs in mysql as primary keys in tables with a few million records, and I'm about to migrate a few tables with billions of records from bigint to uuid.
Never saw any real problem. I use char(36), as it's easier to query manually when needed, but I' looking into binary(16) for those billion row tables.
I think most issues with fragmented tables are old problems since ssd, and the overhead is something you will only notice in benchmarks.
If your keys ate supposed to be uuids, the just use them and get the hardware to handle it. In reality, you're most likely to be affected by a zillion other things before an uuid as pk.
> I use char(36), as it's easier to query manually when needed, but I' looking into binary(16) for those billion row tables.
I would use whatever data type your RBDMS's UUID generator returns or the programming language your application is written in. If your RDBMS supports a UUID or GUID data type, however, I would 100% use that because you'll invariably have functions which help you deal with it.
Remember, however, that many (most?) RDBMSs store records in pages (or blocks) of a fixed size typically between 4KB or 8 KB, and they won't allow a record to span a page (usually when a record is too long for one page, non-key data will be moved to non-paged storage which is slower). In other words, if you reduce your record size by 20 bytes you might not actually see as big a change as you'd expect. You'd be storing less data per record, but you're maybe not changing the records per page. You're not increasing the efficiency of your data store at all because of how the data is physically stored. It also means that the answer might be different for each table since each table has a different row size.
Bottom line, however, is that I would favor storing UUIDs the way your particular RDBMS vendor tells you they should be stored. If your application has particular problems with storing UUIDs that way I would look at alternatives, but generally the RDBMS vendors have thought about this a little bit at least.
MySql doesn't have a UUID data type, the UUID() function returns a varchar. The way you store it is mostly preference and driver defaults. The C# driver used to handle binary(16) as Guids, then they deprecated that in favor of CHAR(36). But when dealing with a bilion rows, each byte counts and I'll favor binary(16) because it's smaller and that helps with the index sizes and memory usage.
I can't comment on uuid as I never used that for ids. But, I can tell you that asking for a next ID from a postresql primary key sequence will make sure it is never used again because it is an atomic operation. Once a next value is requested and you keep it, you can be sure it will not be generated again. I've used this for many things.
we use both uuid's created in our application layer and stored as char(35) and also uuid's produced in the database stored as uuid data types. We have tens of millions of records and have no issues. Once we get upgraded to PG10 and can use hash indexes for them we expect a further speedup.
I don't know anything about the author, but every time in the past ive heard someone say that uuid's as keys are a problem it turns out theyve never actually tried it, theyre just saying what theyve heard in the past. Theoretically they should be worse than they are - and theres no doubt that a regular int/bigint would be faster - but the truth is there are so many other things that are going to slow you down before that.
uuid's aren't guaranteed to be unique at generation, there is still a non-zero chance of it having a collision. using it as a primary key to be generated by the database helps mitigate that, as there will normally be a uniqueness clause on the index.
creating that uuid on the client likely will not accomplish what you're hoping.
The rarity of UUID collisions depends on the quality of implementation. The Quora answers assume UUIDv4 with a high-quality RNG.
I was able to reproduce UUIDv1 collisions at will when the timestamp had microsecond resolution and the clock sequence had to be generated randomly each time. That is, I simply had to get two processes to generate the same fourteen bits within a microsecond.
That only seems possible if both processes were running on the same physical machine (and hence share the same input MAC address).
It's a nasty corner case, but I'm sure the UUID designers considered it a valid tradeoff, since I believe the algorithm was designed for a distributed scenario. In the single-machine case an atomic counter would be a much easier solution with very reasonable efficiency anyway. Still, it might have been clever to also include the local process id in the UUID, I wonder why they didn't to that.
At any rate the problem is easily worked around by running the two processes on different machines, i.e. ensuring you have at most one UUID generating process per host (with respect to the database table in question).
If you generate 1 billion UUIDv4s a second, it would take, on average, 85 years for you to produce a duplicate, and the resulting list of UUIDs would take up ~45 exabytes. And keep in mind that even if inserting a row fails because you've somehow managed to generate a duplicate UUID, it is trivial to make a new UUID and retry. Since the database enforces the uniqueness constraints of primary keys, I'm hard pressed to come up with a scenario in which generating a duplicate UUID would actually do anything serious.
GP pointed out the special version of the algorithm that handles collisions gracefully. Note that this algorithm ("open addressing") is so frequently used that you will probably find a variation of it in pretty much any piece of software you have ever used. It's a well-understood method, not only from a practical perspective but also in terms of theory; check out e.g.: https://en.wikipedia.org/wiki/Linear_probing
> uuid's aren't guaranteed to be unique at generation
If they are not unique, they are not really UUIDs. In which case you should tweak the algorithm to make uniqueness guaranteed. Like add a client id and its logical time in there.
The difference here seems to be the exact definition of "unique".
A UUID has a finite length, so if we generate N new UUIDs in a finite time-interval it seems clear that - in theory - we must have collisions for large values of N (or even infinite N), regardless of how clever we are in seeding our random number generator. But I don't think this can be fixed without using a variable length value or refusing to mint new UUIDs once the available bits are used up.
Of course in practice that should not really happen; at least if we only run on hardware from vendors where we can assume that every MAC address will be unique, the only way to actually get a colission - in practice - would be by generating more than 2^B UUIDs on the same machine within a fairly short timeframe and fairly large B.
EDIT: In v1 of the algorithm it seems that B=14 and the "short timeframe" is the smallest resolution increment your system clock supports. That is if we assume the "uniquifying" clock sequence is produced by incrementing a counter. So we can say that for practical purposes, a collision is impossible on a modern system unless we have invalid MAC assignments to our hardware, unreasonable transaction rate, or an incorrect implementation of the UUID algorithm.
> by generating more than 2^B UUIDs on the same machine within a fairly short timeframe and fairly large B
Which can be made impossible.
IDs are part of the finite system and don't exist on their own. And the system can perform a finite number of operations both concurrently and in a finite time-interval with some amount of uniqueness distinguishing nodes and other useful properties. Making it possible to always find an id generating algorithm for this system that can never have collisions.
I agree, if we are allowed to make the UUID arbitrarily large (and not be limited to the 128 bits from the "official" UUID algorithm), it should always be possible to set B to a large enough value so that our scheme assigns a unique and finite-sized UUID to every possible state of the system, since all the system's parameters (number of servers, time, etc) should be discrete and finite. I.e. basically make B large enough that 2^B becomes greater than the number of discrete timesteps in our experiment. That's an interesting observation.
I like that PostgreSQL can have both relational table and JSONB document collections (NoSQL) in the same database. Use NoSQL where it makes sense and relational tables for data that is inherently relational and query and join both (or batch-process from one to the other). I find this very cool.
Of course I wonder if it's too much cool and in trying to do everything it's falling short in some significant and fundamental way.
Wow, I had no idea it could store jsonb. Any obvious advantages to using mongodb for nosql that I should consider? I’m way more comfortable with Postgres and would rather stick to that
Keep in mind that JSONB is different than JSON in PG. The later will not transform the JSON simply check consistency and has limits and performance hits.
JSONB is encoded and will transform your JSON, it will drop duplicate keys and order might not be preserved. But it's also faster and easier to use (IMO).
There is no real reason to use MongoDB tbh, PG JSONB can be handled like any other field and you can even index into JSON (partial indices even; only index rows where a field is present in the JSONB)
Warning: I haven't used MongoDB in several years, so this info can be outdated!
The main advantage of using JSONB on Postgres over MongoDB is that you can create tables that mix regular fields (varchar, number, date, etc...) with JSONB fields.
Then you can do joins of your table with other tables (no need to map/reduce or other insane processing for a simple join).
That actually sounds really cool. Honestly, even if Mongodb has all those features, I would much rather do everything on postgres seeing as I'm a hundred times more familiar with it than mongo.
I'm a big fan of PostgreSQL, but don't you think MongoDB can be useful when you outgrow a single machine (vertical scaling is not possible anymore) and you need a sharded cluster (and you don't need joins and transactions...)? This is the only situation where MongoDB makes sense, maybe. But even though, I'd probably look at Citus instead.
If you have objects that are happily sharded arbitrarily across machines, don't need joins, transactions, or aggregates calculated on the server, then what are you getting over something like (for example) open-source Redis Cluster?
I ask, because if I spend a few hours designing in advance, and write a bit of code, I can get Redis to do much of what I need in such scenarios (counters, aggregates, statistics, indexes, queues, ...), and being that I wrote a book on Redis, task queues, object mapper, well, I'm going to use that instead (and use some of the public domain / open-source code I've already written).
Also, with my work on real Redis transactions (which I've made work across Redis Cluster) means that I don't even need to give up ACID transactions in Redis, regardless of scale.
Once I need more; in the form of post-hoc analysis, joins, group-by, aggregates, etc., at scale; either I can easily export from Redis into logfiles to csv/tsv/json for Spark, Python + Pandas, and/or Redshift if I've got the $, or at the same time just use pgloader into Postgres and live there.
I haven't mucked about with Postgres foreign data wrappers much, but there is a Redis one available, so maybe I can even drop that Redis -> S3/csv/tsv/json, and get everything I want (direct data structure manipulation in Redis + everything Postgres has).
So yeah. I generally solve my problems with a bit more design in advance, and MongoDB doesn't really have anything to design for/against; you get objects and indexes. Which are usually not as good as Postgres equivalents (Postgres json objects are better than MongoDB, just by themselves, and I'm not the first/only person to say it). And what I get from Redis (raw data structures, 1 million ops/second/core) means that for cases where other folks may use MongoDB, I use Redis. Then I use Postgres for basically everything else.
So yeah, I don't use MongoDB. Postgres for almost everything, and Redis for the cases where Postgres doesn't feel like quite the right fit.
> If you have objects that are happily sharded arbitrarily across machines, don't need joins, transactions, or aggregates calculated on the server, then what are you getting over something like (for example) open-source Redis Cluster?
The only reason I can see to prefer MongoDB over Redis Cluster in this case (no joins, no transactions, no aggregations) is if the dataset doesn't fit in memory. Except that, I think you're right to prefer Redis.
Your comment is a really interesting comparison of Redis and MongoDB. Never thought about that before. Thanks!
1. MongoDB is different from PostgreSQL in more ways than sharded clusters.
2. Pg has variants (e.g., Postgres-XL) and extensions (e.g., CitusDB, as you mention) and methods (e.g., postgres-fdw, pgBouncer etc) that let you keep using quite a lot of Pg features with your data horizontally distributed across machines.
3. If you still want automatically managed sharding, there are quite a few databases (SQL: CockroachDB and NoSQL: Cassandra, FoundationDB) better than MongoDB.
jsonb functionality in PG is fantastic. It does everything youd expect, plus you can do a lot of other nifty things with it that you normally cant do with relational dbs very easily.
> [2] When I say Postgres is great for analytics, I mean it. In case you don’t know about TimescaleDB, it’s a wrapper on top of PostgreSQL that allows you to INSERT 1 million records per second, 100+ billion rows per server. Crazy stuff. No wonder why Amazon chose PostgreSQL as its base for Redshift.
Correction: Amazon chose ParAccel, which was a data warehouse forked from PostgreSQL.
Many data warehouse products have followed this path due to licensing. MySQL is GPLv2 which means you can't ship derivative works without releasing your code. PostgreSQL has a permissive license similar to MIT/BSD. You can do anything you want with the code. That's still a major consideration which the article omitted.
(Cross-posted from another HN link to same article.)
Also, inserting directly into Redshift is strongly discouraged as it's extremely non-performant.
>An anti-pattern is to insert data directly into Amazon Redshift, with single record inserts or the use of a multi-value INSERT statement, which allows up to 16 MB of data to be inserted at one time. These are leader node–based operations, and can create significant performance bottlenecks by maxing out the leader node network as data is distributed by the leader to the compute nodes.
Postgresql is good for analytics, but it doesn't scale really well with a lot of data. I have moved my analytics to Clickhouse, 1000x better performance.
What about upgrading to a newer version of PostgreSQL ? Does that still require upgrading the whole databases ?
I evaluated PostgreSQL several times in the past, and cancelled once I found out that upgrading to a new version requires upgrading the whole databases - our databases are too big and our uptime requirement are too strict, we can not afford it
You can use logical replication between 2 versions and fail over to the new version without any downtime . In pg 10+ it is built into core, for earlier versions you can use pglogical or similar tools.
this states all of the drawbacks to process vs thread but none of the benefits (resiliency / compartmentalization of errors, less need for lock coordination and less risk of locking related bottlenecks with scale, somewhat better host OS CPU & IO utilization, etc. )
> With a clustered index, when you look up a record by the primary key
To nit-pic - it might be the case in mySQL but some DBs (SQL Server for instance) allow the clustering key to be something other than the primary key, and for some analytical workloads this can be much more efficient.
You're correct that PostgreSQL has had clustered indexes for quite a long time. The only difference here is that PostgreSQL allows a HEAP table. That said, you do have to re-cluster a PostgreSQL and that requires an exclusive lock on the table, which is obviously not idea for a massive table. MySQL will always cluster on the PRIMARY KEY or (if there isn't one) the first UNIQUE key[0], but as far as I can tell it always clusters on write. I don't see any way to configure the padding of the clustered index like you can on SQL Server, so I'm not sure how this is accomplished.
MySQL only supports SQL in procedures[1]. There is no PL/pgSQL[2] equivalent, and except for custom UDFs written in C/C++, there's no support for external procedures, either. PostgreSQL[3] supports PL/pgSQL, PL/Python, PL/Tcl, and PL/Perl in base, plus there's external modules for PL/Java, PL/Lua, PL/R, PL/sh, and PL/v8.
This is not the same thing - it is rearranging an existing index for efficiency as a one-off process. It needs to be repeated when the data is substantially changed.
With a true clustered index the clustering property is as far as possible (it can get somewhat fragmented in the presence of random data) maintained during normal operation without the need for a full rebuild every now and then to keep the benefits for new data.
> When a table is being clustered, an ACCESS EXCLUSIVE lock is acquired on it. This prevents any other database operations (both reads and writes) from operating on the table until the CLUSTER is finished.
This makes that operation very nasty. For a large amount of data you are looking at locking your applications out of the database for some time, and the delay is relative to the total data size in the table being acted upon, not the amount of data that has recently arrived or changed.
I think it's different. From the docs, this is a one time update to the table. The table itself is not clustered, it's just ordered based on a clustered index, so selects by default should come in that order. But once an insert is done, it's again ouy of order.
Also, an order by using the cluster field most probably invokes an index scan, while a clustered table doesn't.
DBMSs are giant complex pieces of software with a million features - it's really hard to compare them. But if I had to sum it up, you can dump freaking line noise into Postgres and then hide the nastiness and manage and query it like a well-designed database. If you have a pretty database, good for you, but today's app writers have gotten lazy with NoSQL record stores and their databases resemble vomitoriums - and let's not talk about what people used to do in the 80s and early 90s.
Without further ado, Postgres vomitorium cleanup features:
- user defined functions/aggegrates/windowfuncs in your favorite language incl JavaScript, which means you can write tricky business logic once and run it where the data is, vs pulling out millions of records from the database. Language list: https://www.postgresql.org/docs/10/static/external-pl.html
- foreign data wrappers with hundreds of connectors AND a 5 minute toolkit for authoring new wrappers in python and other scripting languages. https://wiki.postgresql.org/wiki/Foreign_data_wrappers http://multicorn.org/
- index goddamned anything. Postgres has the most array of index types of any open source database AND if you need, you can easily write a function (in javascript or python!!!) and create an index that's the result of that function call. Postgres even has a full range of partial indices and block range indices, which make it practical to index massive and sparse datasets. https://www.postgresql.org/docs/current/static/indexes-parti... https://www.postgresql.org/docs/current/static/indexes-expre... https://www.postgresql.org/docs/current/static/brin-intro.ht...
EXAMPLE: I was once handed a MySQL database of IoT signals where timestamps were in seconds since the epoch and asked to report on this data without changing the database. No biggie: 15 minutes to replicate this OLTP database to a read-only Postgres replica (dumb schema mapping) - then a Postgres VIEW to hide this nastiness and a function index on the timestamp column (to_timestamp).
(from memory) CREATE VIEW foo AS SELECT *, to_timestamp(mysql_ts_col) as ts_timestamp FROM replicated_mysql_table; CREATE INDEX foo_ts_inx ON replicated_mysql_table(to_timestamp(mysql_ts_col));
At another company, we got dumped a load of JSON and weren't sure how we'd need to parse it. No biggie, I just created indices using function calls that parsed the JSON.
- tons of native datatypes and extensible datatypes. In cases where you're handed complex structures or "weird" data that doesn't behave like most programmers expect, you can define new datatypes, then create a library of user defined functions around them. https://www.postgresql.org/docs/9.5/static/xtypes.html
- sampling. Postgres has native, low-level support for queries that sample the data, which makes it super fast to explore data while preserving some semblance of statistics. https://blog.2ndquadrant.com/tablesample-in-postgresql-9-5-2...
- EXPLAIN. The Postgres planner/optimizer is still the king at explaining why your query is taking forever and what you can do about it. Admittedly, this stuff quickly gets arcane, but you can post your EXPLAIN output to a forum and guys like me will tell you how to override the JOIN order, update statistics, etc. https://www.postgresql.org/docs/current/static/using-explain...
Finally, no conversation about Postgres vs <x> is complete without mentioning that ALMOST EVERY FEATURE IN POSTGRES WORKS WITH EVERY OTHER FEATURE, which means you don't waste hours investing in something and then "oops" your carefully written user defined function can't be invoked in some obscure place - with Postgres, you can assume everything just works and will keep working. There's few pieces of software that can claim this.