I've been on a project where we were forced to migrate the opposite direction: From PostgreSQL to Oracle, because the client was already paying for Oracle licenses and really, really, wanted us to use Oracle to justify the expense.
It was actually a pretty big setback. We were using PostGIS to support spatial queries (a key requirement), and Oracle Spatial was just not at the same level (both in performance and features). The development experience with Oracle was also awful. The licensing for Oracle was highly granular, down to the feature level. More than once I'd identify a feature that provided a solution to an issue through online research only to be prevented from using it due to the customer not having the requisite license for it.
And the support was useless. Oracle was so complex (by design) we resorted to contacting support a couple of times - they would send out an "engineer" who could turn any technical troubleshooting session into a sales presentation for some Oracle product or feature that would "solve" whatever the issue was.
I will never work on a project involving Oracle again (barring obscene amounts of money to assuage my frustration, of course).
Same - it's such a difficult technology to deal with when outstanding flavors of DB exist (many for free): Postgres, MySQL, MSSQL.
The Oracle projects I've worked on are just people choosing it because "it's the safe business decision and a household name". I've had semi-good luck convincing folks over to MSSQL in these circumstances which is a night-and-day improvement in ergonomics/features.
I thought MSSQL could only come with windows servers (until last year).
What is your experience with moving folks from Oracle / Linux to MSSQL / Windows?
I too think MSSQL is the best contender to replace Oracle in many regards, but I don't see the change in operations going well, different skillset and sysadmins.
I mean - it's a nightmare and it hurts a lot. My experience is always lots of planning, refactoring the original schema, and working with SWEs directly on the code. As a SWE myself, the two major Oracle to MSSQL transitions have been heavily application-specific vs. other situations that may have a large number of downstream dependencies... I guess what I'm saying is my blast-radius was limited to 2-3 services vs. some of the nightmares I know other folks deal with here.
Given that my blast-radius was small, I was capable of doing a "from the ground up" approach by building data migration code by hand (ie: manually ripping through records, often validating them, doing any transformations, and writing them to the new DB target)... the largest DB I had to move like this was <100 tables and only 15Gb - I mention this because what I'm discussing isn't actually that impressive vs. your typical Oracle migration looks like!!! I was also incredibly lucky that the jobs I accepted were only had 2-3 years worth of data in there. People reading this are like "ha - easy mode" and they're 100% right.
Now that all this being said - I know my techniques are not ideal, and it's only one way to skin the cat... which is to replace the cat outright in an incredibly tedious manual technique. But hey, I still came in on-time and under-budget twice! I also had incredible confidence in the new DB target (MSSQL) as I literally touched/vetted everything by-hand. Stupid like a FOX!
Now if you can't do a hard cutover - all of this doesn't really apply. If I would have had to make a gradual transition I would leave it to someone better suited for the task.
> but I don't see the change in operations going well, different skillset and sysadmins
Oracle can be incredibly expensive or incredibly cheap... In my situation it was small-to-medium sized companies doing simple logistics stuff. It was incredibly cost-effective to replace it with an easier-to-use DB and most were happy to invite the change + learn new tooling.
It also was a HUGE productivity boost to the devs... I don't think people realize just how dumb Oracle/Oracle products are when it comes to simply "building something". To the extent that I find Oracle's product offering to be truly offensive, as-in it offends me as an engineer that someone would think to hand me such a non-ideal tool in 2020 ugh.
EnterpriseDB seems like PostgreSQL version that can emulate most Oracle's features, although I would personally use it as a step in migrating (i.e. move from Oracle to EDB, then gradually convert your data to be PostgreSQL only, once done move to pure PostgreSQL)
With EDB ships a full featured PostgreSQL with some additional useful time-savers and a lot of experience for a fair amount of money. A part of this money goes direct into PostgreSQL development. So it's a fair deal.
I've spent a fair amount of time at the Sofitel in Redwood Shores, and I'd often chat with Oracle "sales engineers" at the bar. A simple "so, what do you work on" question would inevitably generate an hour's worth of them explaining some convoluted acronym-heavy product with a pushy sales model, and I'd eventually be like "ok... so it's a database?".
To be fair to Oracle (not something I say very often) - I suspect most of their sales are business applications (ERP, CRM, financial) that happen to use their database engine as a back end.
At Oracle everything uses Oracle DBs. Their bug DB is a thin layer on top of an Oracle DB. Their email server too. Everything that can be a thin layer on top of Oracle... is.
That's what we call throwing good money after bad. Besides, PostgreSQL has actually come a long way since 3 years ago or so. It's not a slow-moving project, especially for this space.
How often do developers recompile postgresql? How often do developers go through the hassle of upgrading databases versions? Whatever postgresql may have done recently, it won't be used and available in the common distro until a while later.
Bear in mind that minor versions in postgresql are breaking changes. It does not follow semver.
> Bear in mind that minor versions in postgresql are breaking changes. It does not follow semver.
1. This hasn't been true since three releases ago (i.e., PG 10).
2. Before PG10, X.Y _was_ the major version of PostgreSQL, Y was _not_ the minor version. It has never broken backwards compatibility without a major version change.
I don't know anyone who compiles their own Postgres at all.
AWS is a pretty common "distro" (of sorts). Postgres 12.2 was released 2020-02-13, and AWS RDS supported it on 2020-03-31. You don't have to wait very long to use this.
Uh, never? I mean, there may be cases where you might need to do that, but I don't want to use a PG built by JoeRandoMacbook somewhere. It's better to use PGDG.
> How often do developers go through the hassle of upgrading databases versions?
As often as they have to. Most places don't upgrade all that often.
I don't know what things are like in the RPM world, but for Debian each new PostgreSQL version is packaged at apt.postgresql.org as soon as it's released, by the same people, and to the same high standard, as the packages that go in the stable release.
(Also, since Postgresql 10 they changed the numbering scheme so that the major version changes for each 'breaking' release.)
Many of the cloud providers make it relatively easy to upgrade and might even force you to do so after a while (so they can drop support for old versions).
I really don't like Oracle from a DBA perspective but it's still often far ahead of PostgreSQL when it comes to query performance. In Postgres, the query structure can make huge differences in terms of performance and it can take a lot of tuning to find the right query to optimize performance (especially when subqueries are involved). Oracle (and SQLServer) are usually pretty good at optimizing the query exactly the right way, reducing development time by quite a bit.
This is actually my experience with Oracle; I had a DBA to take the management pain from us, and we had the budget for the correct licenses and hardware. The query performance was phenomenal considering the absolutely crazy amount of data we threw at that thing.
That being said, I would not recommend Oracle without all of the above factors already in place. For most use cases, Oracle is more pain than it’s worth.
As somebody who does a fair amount of DBA work, Oracle is my favorite RDBMS to operate. Low level administration is easy to manage, the optimizer is very performant, and the plan management features make tuning easy, and statistics management much less risky (and it can be very risky).
I wouldn't use it on any of my own projects though, but only because I wouldn't want to pay for it, and because I have enough faith in myself to be able to manage the trickier bits of Postgres.
To be fair, I have no idea how hard it was to manage Oracle vs. how much my DBA just liked to gripe. But I largely agree with you, except I would lean towards a managed service because I’m no DBA.
It's very true today. The problem discussed here is performance on complex queries (e.g. subqueries), and the query planner plays a huge role in that. The Postgres query planner has various issues. Here are two recent posts talking about planner issues:
Also JIT compilation, while very nice and a step in the right direction, is very barebones at the moment, hardly achieving its potential. Here's a long todo list of what and how to make efficient use of JIT in postgres, by the main author of the feature. https://twitter.com/AndresFreundTec/status/10025899696161996... Postgres release 12 did not add any JIT related improvements, and as far as I know no work has been done on it on 13 either.
The traditional view that the bottleneck for query performance is overwhelmingly disk throughput, which has become slightly less true with the advent of SSDs.
I just made a very casual search of the most recently completed commitfest and I saw one entry ("JIT expression evaluation improvements") marked as "Moved to next CF" and in the upcoming one the only item planned was the one that was moved.
I would say performance is still an issue in certain cases. We had some long-running queries with views and subqueries running on a down level version of PG. We recently attempted an upgrade to 12 to address this and take advantage of the latest RDS generations. The ideal architectural solution would be to sunset the views (involving unions) in favor of a better normalized database, but you do what you can when you inherit legacy. And we're not talking about millions of rows here. It's a relatively small installment.
Anyway, to our surprise, not only did it provide little boost, but other queries that were simpler and previously caused no issues became inexplicably and intolerably slow.
We did not have time to go down the rabbit hole of query planning and explains, and being a highly dynamically queriable app, we didn't know what else was in store for other queries based on some combination of user inputs down the road.
So we rolled back. PG has otherwise been great for us. It really is the performance issue in certain situations that has caused pain.
One more note is with regard to what I would call extreme performance variability based on row counts. What I mean is there seems to be some threshold that, once crossed, causes some queries to go from perfectly fine to near-interminable. You expect some degradation as row counts go up, but here the performance behavior suddenly degrades in a nonlinear fashion. That kind of issue is difficult to tune.
> Historically we've always materialized the full output of a CTE query
Do you know if this means that CTEs are written to disk? I didn't think this was the case. (In the case of materialized views that qualifier means the view is written to disk).
Here "materialization" just means that the query planner treats it as an optimization fence. That means the system doesn't do any optimization between the CTE and the query referencing it. It would prepare the output of the CTE as a completely separate entity essentially as if you had dumped it to a temp table. It may not be written to disk if there was sufficient memory, but either way you're sacrificing any optimization. I believe that it would even materialize the CTE multiple times if it was referenced multiple times, but don't quote me on that.
On the other hand, if the same query were written with subqueries instead of with CTEs, then the query optimizer would not treat it as an optimization fence. If it could utilize indexes or rewrite the query to be relationally equivalent, it would do so.
Note that sometimes that optimization fence is beneficial. There are situations where it's better to create temp tables and run smaller simpler queries instead of running an extremely complex monolithic query because the query planner isn't perfect even with hints. You can still enable that optimization fence functionality in PostgreSQL if you need to, but it's generally pretty rare that it happens like this. Still, you'll see stored procedures for reports still using temp tables and even cursors sometimes because they can be made to perform better in certain situations.
I left postgres v9 and have been working on Vertica. It's pretty surprising that Vertica came out with that WITH/WITHOUT MATERIALIZATION keywords 2 years ago.
Specifically for geospatial queries we found Oracle Spatial to be inferior to PostGIS (which is much more widely used). Regular relational queries seemed to be no worse (admittedly, our needs in a database were somewhat modest in that regard).
Though, most of our team's expertise as far as databases went was in Postgres (including our DBA).
I've generally found SqlServer's query optimizer to be a hellish nightmare of broken dreams, so if Postgres' is even worse I'm giving up and going back to flat files.
I've found that SQL Server's is generally pretty good, with the big pitfall being the system's timeout for the query planner/compiler. Query compilation timeouts can be really frustrating to work on because, often, the query's complexity is a requirement.
The only other problem is the parameter sniffing problem for stored procedures, although OPTIMIZE FOR UNKNOWN or specified values seem to work fairly well in my experience, though obviously not always.
The real failing is that a common solution to a view query hitting the compiler timeout is to replace it with a stored procedure of some kind. However, if you're not careful you'll run into the parameter sniffing problem with stored procedures! So you run into one caveat and your attempted solution runs into the other one.
I found SQL Server's query optimiser to be magical, but it relies on its table statistics being somewhat correct. Every now and again they liked to suddenly become wrong enough that queries go from magical to catastrophic mush.
(Most recent version I've used was SQL Server 2008.)
All cost based optimisers rely on statistics being correct. And inevitably there will be a case where they aren't. The problem with PostgreSQLs optimiser, and I assume with others too, is that it's too risk happy. It's optimizing for the average case based on the statistics, when people actually tend to care about the worst case.
As an example, say you have a database of all cars ever produced with indexes on model and production date. If you are looking for the latest Ford F-150 then the best plan is to just start looking backwards by date and you will find one soon enough. Much faster than looking up all F-150s and picking the latest one. On the other hand, if you are looking for latest Ford Model-T, that plan is going to be catastrophically terrible, going through 93 years of car production before finding the correct one.
Your example illustrates perfectly how a descriptive query language simply can't work in the general case for something other than ad-hoc workflows where performance isn't a big concern. Why are we all doing this instead of engineering DB access procedures directly? This is one thing I really liked about CouchDB and its map/reduce based access - stop trying to be smart with queries, instead be dead simple to scale horizontally. Granted, it's hard to design the data structures in the design space of normalization (many joins) vs. read performance (few joins) - but at least it's all laid bare to reason about.
> All cost based optimisers rely on statistics being correct. And inevitably there will be a case where they aren't.
Every single time the SQL Server query optimiser did something obviously wrong, rebuilding statistics fixed it. The problem I had with SQL Server wasn't its reliance of statistics—it's not like they could work any other way—it's that it failed to maintain its statistics correctly. Any time that statistics stop being correct is a bug. It should be able to maintain them itself and trigger rebuilds whenever there's any doubt about them.
No. Postgres has in-memory bitmap indexes, which are built on the fly while scanning the index and used to more efficiently combine AND/OR clauses, but that's not quite the same thing. There have been several attempts at adding on-disk bitmap index support to Postgres, but they've all been abandoned.
I think it's a company culture overall. The same experience with Oracle cloud, I saw low prices, decided to try... Their Kubernetes engine was purely horrible not even alpha comparing to Google offering. My bare-metal installation was much more mature.
A company I was working with, also did this. Their excuse? The Oracle licenses were very cheap because the CIO is a genius negotiator. Never mind that PostgresSQL is free and much better product. Either they are really stupid or they are getting something from Oracle under the table.
Oracle used to be by far and away the best database out there. Now I wouldn’t use it even if you paid me. It’s shocking how little Oracle invested in developing their products and services over the years. They are a distant second, if not merely an “also ran”, for everything that they do. The company largely exists as an experiment in just how far you can go with a vendor lock-in strategy. Sadly that experiment is proving to be a remarkably successful one
They're a victim of their own success. They became a monopoly and the quality of their product stopped mattering. They're an example of Steve Job's comments on Xerox's failure[0]. It happened at Oracle, IBM, Cisco, and Microsoft. It's happening now at Apple, Intel and Google.
I agree on the overall theory (dominance in a sector tends to shift internal incentives in such a way that the result is an ossified development structure), but I think Jobs' terminology is imprecise. Larry Ellison is not really a product guy first and foremost, he's the definition of a tough salesman. The "bad guys" are a more generic variety of "corporate type" who can materialize in any department, really. Typically they power themselves up the ladder with "cost efficiencies". The most recent Oracle CEOs (Hurd and Catz) fit that profile to a T.
En passant: another big name that suffered from this phenomenon was Nokia.
I've a feeling that Oracle succeeded by accident. They won the race to win mindshare without understanding that that's what they were doing, and have since rested on their laurels and high pressure salesmanship. You see their lack of attention to mindshare in... everything they do.
The article seems to misunderstand what table inheritance is in PostgreSQL.
CREATE TABLE new_table AS TABLE existing_table;
Doesn't create any PostgreSQL inheritance relationship between the parent and child tables. It merely makes a new non-inherited table with a copy of the data whereas with true table inheritance you're working with the same data (there's some visibility rules to consider between parent and child, but that's different than a copy).
I'm also uncomfortable with too simply stating that you should think of this like OOP inheritance; while I agree that in some respects there's passing similarity, it is its own beast and needs to be understood outside of the OOP paradigm to be useful. Many of the Object Relational aspects of PostgreSQL are very powerful, but can not be understood in OOP terms.
Also, another part of the article talks about the ramifications of not having Oracle "packages". So while it's not completely the same concept and there are different sets of trade-offs, one option includes using PostgreSQL schema for this sort of logical namespace organization. Both Oracle and PostgreSQL have the concept of different schemas, but Oracle has a much more rigid idea about schema usage (related to database users) and PostgreSQL has a much more fluid idea about usage. As a former Oracle guy, I can see how that organizational tool might not be front of mind when coming to PostgreSQL, but I've used PostgreSQL schema for this sort of organizational purpose with good success.
I'm a happy Postgres user and recently did some work with a government agency using Oracle - the thing that shocked me most about Oracle was the lack of transactional DDL operations which was something I'd just taken for granted in the Postgres world.
Coming from MySQL to Postgres a few years back the transactional DDL statements were a joy to work with - I've had to claw a legacy into the modern era and utilizing them has allowed me to execute live migrations from legacy into shims and then from shims into modern.
I also really appreciate the transactional TRUNCATE - I pretty much never use it but at least in Postgres I never have to worry about someone else trying to run one and wiping state unexpectedly.
> I also really appreciate the transactional TRUNCATE - I pretty much never use it but at least in Postgres I never have to worry about someone else trying to run one and wiping state unexpectedly.
as long as auto commit is not enabled.
These goodies are possible, because of PostgreSQL's MVCC which requires running vacuum. Nothing is for free unfortunately.
Interestingly even though MSSQL server uses an extremely different implementation of MVCC, it internally has a vacuum equivalent. (Which is required even when all MVCC support is disabled! It is used to enable efficient implementation of deletes, without having to use absurdly coarse locks).
MSSQL just handles doing that cleanup silently in the background while exposing basically no no configuration except a trace flag that can turn it off.
I don't think undo vs. heap based MVCC choice has any big impact with regards to transactional truncate, nor transactional DDL in general. You are likely to see proof of that within a couple of years.
For one, it does things that the competition is not capable of or just inferior -- and despite the bias on this site, the major consumers of RDBMSes are not price sensitive scrappy startups.
There is no comparison between the HA offerings in Oracle and something like Postgres, which are comparatively toys. Replication doesn't equate to HA and the "nobody got fired for buying xxx.." actually has some justification. Why would I risk my reputation on a flimsy solution just to save a few bucks in a large corporation?
I would argue that using postgres today sets you up better for HA tomorrow, as both yugabyte and cockroachdb are built with postgresql compatability in mind. I would also argue your reputation as a competent CTO / Architect / whatever is more at risk by choosing Oracle in 2020.
> I would argue that using postgres today sets you up better for HA tomorrow
The point is, if I have the money I can get enterprise grade HA today. Yugabyte and cockroach are promising but they’re small unproven organizations and as history has shown they’re likely to get bought out and then who knows.
Larry Ellison is an asshole, Oracle the company itself makes me want to vomit but they are a pretty known quantity.
> CTO / Architect / whatever is more at risk by choosing Oracle in 2020.
This trope started reaching fever pitch during the first wave of OSS commercialization hype/fervor in 98. It wasn’t true then and I see no evidence it is any more true today.
(I am the CTO of Yugabyte) Your points are all completely valid. Just wanted to add my 2 cents.
With YugabyteDB specifically, we are more than just PostgreSQL wire-compatible, we "reuse" the upper half of PostgreSQL to support almost all PG features (examples: stored procedures, triggers, partial functions, etc). So the aim is to build something that has "almost all PG features" while being able to "run cloud-native - with HA, scale and geo-distribution" - our hope is that this allows YugabyteDB to really become a viable option instead of PostgreSQL when apps are being built for the cloud. Here is a blog post on the benefits we realized reusing PostgreSQL: https://blog.yugabyte.com/why-we-built-yugabytedb-by-reusing...
> This trope started reaching fever pitch during the first wave of OSS commercialization hype/fervor in 98. It wasn’t true then and I see no evidence it is any more true today.
It is not. And I know of several major banks leaving oracle "en masse" because of licensing nightmare. The major decision makers are now endangered by their choice of oracle as a database to consider. Oracle in a new project is now a firm NO.
Regarding HA, active/passive and failover is enough for 99% of the use cases.
For the rest you'd need citus or patroni, but it's totally manageable.
I'd be quite dismissive of an architect who suggests oracle if there are no extreme availability requirements.
I'd also prefer a galera or an innodb cluster for active/active architectures.
Let's face it: oracle database is dying an its niche is shrinking.
> oracle database is dying an its niche is shrinking.
This is true, but misleadingly not the whole truth.
On prem RDBMS is dying and its niche shrinking (at least in this part of the cycle)... having said that, Oracle cloud offerings are doing very well.
Amazon made a herculean effort, and made some headlines when they migrated most of the business off Oracle last year. That's wonderful, for Amazon. Most of Oracle's enterprise customers are not Amazon.
> I'd also prefer a galera or an innodb cluster
The thing is.. Oracle sells so much more than just an DB engine, and people are buying. Oracle RDBMS is not an inferior product to open source competitors, in most ways superior, and yet, it is really just a proverbial loss leader.
This is the point that's getting missed on this site. Oracle can charge a fortune for their products precisely because they can still deliver things that other RDMS's can't and they are still offer the highest performing DB if you're willing to pay for it. Most business don't require this level of performance, but many do.
Yes, if you're a startup with an MVP that recommends good deals on imported wine, sure you can, and really should, use a simpler open source RDBMS. But, if you're a large established company with billions in revenue and need a DB that delivers as close to perfect reliability as possible, then spending a small fortune on Oracle licensing and all the hassle that comes with it, is a pretty minor thing relative to the bigger picture. A critical failure even once in a blue moon will cost an order of magnitude more than your licensing fees.
There's a lot of obituary writing for Oracle, but think the real story is that Oracle is going to go from the dominant RDBMS provider in business world, to a niche player occupying the high end of the market. It might be where they want to end up, but it's not a terrible place to be, either.
I think the major feature from proprietary RDBMSes that you rarely see in OSS RDBMSes is workload management.
The ability to set CPU, RAM, disk and network quotas on queries or users is extremely useful and powerful. That capability by itself is pretty close to non-negotiable for any large line-of-business system.
I know there is something of an OSS RDBMS renaissance happening at the moment, but I haven't heard of any of them offering this, except for Greenplum ... which used to be proprietary.
Disclosure: I work for VMware, which sponsors Greenplum, so it makes sense for my awareness to be biased.
I have not been an Oracle fan in the past, especially because of their complicated (and expensive) licensing, but late last year we moved to their hosted autonomous database. The on demand pricing model makes it quite economical, and the performance is amazing.
However, the killer feature for me is that it has application Express (or APEX) included, which is a complete web application development framework, as well as Oracle restful data services (ORDS). With built-in application development and deployment, it is the only complete, full-stack data management platform I am aware of (enterprise level).
YRMV, but it has been incredible for us, both to support our data science initiatives, and for rapidly deploying applications. I couldn't imagine going back to anything else.
My last job was 100% based on ApEx. Which I do not bring up on my LinkedIn profile.
I refer to ApEx as "Access, for the web, for Oracle". For what it's good at, it's pretty good.
The biggest plus in my view is that it rewards careful schema design. Point it at a properly-normalised schema and you can get 80% of a useful CRUD interface for 20% of the effort.
But all things being equal I'll be happy to never use it again. It's hard to test, hard to version control, hard to safely extend (you usually wind up with buckets of PL/SQL below the surface).
That is somewhat of a concern, but less so now that they support web services. So, you would still need a basic oracle DB set up in order to run APEX, but you could just use OracleXE, which is a quite capable free version, and then connect to whatever you want.
Also, you see this argument a lot - "what if I want to switch databases"? I've seen more than my share of overly-complicated and highly non-performant code bases, "just in case we want to change our database at some point" (see the ORM messes out there.) Its a problem in theory, and in my experience, not in practice. Never in 25 years of IT work have I switched databases, so it's often a classic case of "prevention worse than the disease".
One case where this use to be an issue was for software vendors that used to sell applications requiring a database, and they had to be ready to work with whatever the customer had. In our day of cloud based apps, this is increasingly becoming less of an issue.
"What if I want to switch" is a valid question if your business is at the mercy of a single vendor with a history of abusive business practices towards its customers.
While I may have a selection bias, I have seen quite a lot of companies migrating their applications, successfully. Most for exactly this reason. And many, perhaps even more, that would very much like to migrate if it was less disruptive.
Yep - agreed. It's "pick your poison". No matter what tech stack you select, migration away from it is a may be a consideration. (Whether it is a JavaScript framework, PHP to python, C++ to Go, or one DB engine to another.)
If being able to easily switch technology stacks is important to your business success, then you need to optimize for that. That's not the case for me.
> YRMV, but it has been incredible for us, both to support our data science initiatives, and for rapidly deploying applications. I couldn't imagine going back to anything else.
Out of curiosity, are your data scientists actually happy with this? All of ours are using Python notebooks with Spark or RDS, and I think there'd be an armed revolt if we asked them to migrate to anything else.
So far, yes. APEX in this case is used for data visualization and dashboards, mostly using Plotly.js.
And, the oracle database is the data store for clean/structured data. They use python notebooks and all the other python based goodies for their work - the DB is just where they get their data. (And, sometimes it makes sense to do data processing with the DB). We could end up using Spark at some point - we're not precluded from that.
> However, the killer feature for me is that it has application Express (or APEX) included, which is a complete web application development framework, as well as Oracle restful data services (ORDS). With built-in application development and deployment, it is the only complete, full-stack data management platform I am aware of (enterprise level).
Yeah those architectures were nice in the 90s. We don't do that anymore, for a whole lot of reasons. Being locked in with such a nefarious vendor is such a big risk.
Yep, you could do that and I'm sure it will work well, just like you don't need dropbox because "you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem."1
Once we signed up with Oracle Autonomous DB, we immediately could start developing and deploying web apps and web services all within the platform, without installing or integrating any other tools, and I'm not aware of another enterprise level platform where that's possible "out of the box". So, re: the original comment, that's why we chose Oracle for a reason other than supporting legacy systems. It helps us go fast, performance is great, and the on-demand licensing gets us all that without breaking the bank. There are definitely other ways to achieve the same thing (as you suggest), but I'd rather not spend my time doing any of that, just like I'd rather not roll my own dropbox, trivial though it may be.
Can you point me to a good resource on this? I work 100% in the data layer right now, but the idea of being able to create an easy frontend (assuming I learn react) would be wonderful.
Yeah, there's not one solution. There are many. Many are open source. PostgREST is amazing. The rest is up to you, but there's tons of tools out there.
Oracle and SAP have products that touch niche areas of businesses. Oil company with complex shipping and receiving looking for accounting software? Global metal foundry who needs to track raw materials to finished goods and forecast everything? SAP and Oracle can sell your VP overly complex products for almost anything.
For the DB, corporate executives types feel much more comfortable choosing Oracle or IBM. It usually bites them in the ass down the road due to licensing or support costs.
This is certainly all true, and with Oracle, absolutely everything is negotiable. Nobody pays list. It still isn't cheap. If you are going the Oracle route, you might even consider hiring a consultant to do the buying, because relationships and knowing the Oracle way can make a huge difference.
Also, Oracle Database itself is more than just an RDBMS and has an enormous amount of features that have no analogs in Postgres or any other non-commercial system. Take a look Oracle's data warehousing components, like advanced analytical SQL, pattern matching, and the especially cool modeling: https://docs.oracle.com/database/121/DWHSG/sqlmodel.htm#DWHS...
> - A much better experience in Java and .NET drivers, including SQL custom data types.
I didn't have any better experience with Oracle drivers in Java. Most of the driver is a soup of hacks exploiting obscure features of both the VM and standard library (both the vm and jdk are "Oracle owned" so I guess I was expecting that), also the source code is not available, so debugging it's a hellish experience.
On the other hand the Postgres JDBC Driver is the most well written and documented driver that I ever saw in Java
Sure it works, and is well tested and mature, and so are most of the jdbc drivers, never had any problem with any jdbc driver, aside of digging to code to understand some not documented behaviours. I don't see why put the Java driver as a benefit point of Oracle, most of the developers problems and issues don't occur at the JDBC level. It might be true for .NET but I'm really supicious if the Oracle support is that better over others
I didn't say code quality, I said usage of obscure and internal hacks from the JVM and JDK
> - A much better experience in Java and .NET drivers, including SQL custom data types.
So I'll admit, I haven't used the Oracle Provider for .NET since 2017.
Oracle.DataAccess is not the friendliest library to work with and I've seen some weird issues with pooling in past use. Devart has a good provider, but somewhat limited in free features (still a better general experience than Oracle's provider, but you don't get all the custom bits unless you pay).
PostgreSQL on the other hand has a very nice ADO Provider in NpgSql. It can look a bit daunting with all the config options offered but overall I'd still say it's a better API experience than Oracle.DataAccess.
> - A much better developer experience for stored procedures, with proper packaging, compilation to native code, graphical debugger.
Oh I do miss Oracle Packages so so much. Yeah, they could be a bit annoying to deal with from a 'gobs of code in one file' standpoint, but it's -so- nice to just have PKG_CUSTOMER, PKG_LOCATION instead of having to scroll through all the individual stored procedures, having to guess whether people named things in a way you could find them...
Edit: For a long time, you could have added
- Arguments about how to store a boolean value
But thankfully Oracle finally took care of that in 12.
dbeaver and plugin_debugger. For distributed transactions, it all depends on why you need them in the first place but there are multiple solutions in postgres to handle them.
I STRONGLY dislike Oracle the company, and the Oracle DB is quite complex, however Oracle RAC (HA) beats the pants off anything else out there as far as performant, reliable HA. Oracle DB also has been the leader performance wise for complex queries and large datasets. All of that is very important in my niche (enterprise grade eCommerce).
> Oracle DB also has been the leader performance wise for complex queries and large datasets
SQL server has a column store type of storage, and major innovations like 'froid'. Oracle is not such a strong leader there.
Also, on a whole lot of workloads clickhouse is much superior.
Yes, and? If you're choosing a tool to deploy for that workload, why would you deploy Oracle instead of ClickHouse? The same question goes for any other analogous workload. Why use something that's second-best at 100 different jobs (Oracle), when you could just choose the best-in-class tool for the exact job you're doing each time?
Especially since, in the particular use-case we're talking about here (data warehousing), the whole paradigm and all the tooling is built around the expectation of ETL pipelines copying+transforming+"cubing" data around from OLTP (or data-lake) systems to OLAP systems. "Everything being part of one solution from one vendor" doesn't make one whit of difference in that case, since the whole architecture is expected to be built around having a one-way pipeline of mutually-opaque interoperating systems, so any two pipeline stages that can manage to speak to one-another at all can't really be any "more" well-integrated than that.
> Especially since, in the particular use-case we're talking about here (data warehousing)
I didn't read any context of data warehousing except for the ClickHouse comment.
When comparing CH to Oracle, there's at best a 10% overlap. Within that overlap, CH is pretty amazing in what it can offer. However, for the remaining 90% Oracle kicks the shit out of CH.
CH does not have to worry about being an OLTP database and everything that entails (transactions, MVCC etc.) That means CH gets to take a LOT of shortcuts to offer what it does.
It's mostly used because the people who hold the purse strings in large organisations don't have a clue about databases, but the salesmen for Oracle wear fancy suits, take them out for expensive dinners and treat them like chums so they get the deal.
It's a super convenient operation but it has one big drawback which might be unexpected: `if exists` first acquires the relevant lock then checks.
This means a "ALTER TABLE table_name DROP COLUMN IF EXISTS column_name" will first acquire an ACCESS EXCLUSIVE lock, then check if the column exist.
Since DDL is transactional the lock will not be released until the transaction is committed or rollbacked, therefore even if the column doesn't exist it will prevent all concurrent operations on the table.
I don't have a lot of experience with Oracle but I can tell you that Postgres optimizer is shit compared to Db2 zOS or Db2 LUW.
When I worked in a large bank we tried to migrate core system from Db2 zOS to Postgres and it went nowhere.
I was a in-house developer working with Postgres consultants and they were amazed by db2 performance in OLTP scenarios.
So if your organization is already spending cash on Oracle, Db2 or MSSQL, use them for superior performance. Migration off them is costly and risky process.
If your working at a startup there is absolutely no reason to choose anything but Postgres if you need relational.
I've never understood why RDBMSes don't offer a low-level protocol where, rather than a SQL statement, you can just send over the exact query-plan AST you want the DB to prepare in some binary encoding. Then you could pre-compile your hot OLTP queries and heavy OLAP reports offline against your existing DB schema, using the same sort of techniques that got Lucene its optimized Levenstein-automata JVM bytecode. You could even tweak the resulting plan, op by op, before letting it go to the DB—as if you were doing final ASM tweaks on a game in the 90s.
In fact, my company wants to spare the expenses on Oracle. This is the main reason for migrating.
We'll see how it works out.
You are not the first one to point out postgre's optimizer. Is it really that bad?
YMMV. That depends on your workload, how your data is structured and a million other things. I know a few shops with heavy usage that couldn't be happier.
It may require handholding if you are not happy with the plans it is generating. Also the quality of the query planner results will depend a lot on how up to date the statistics are.
It's not that it's bad, but Oracle/MSSQL have had the benefit of decades of corporate muscle, researchers, and Fortune 500 clients to help pave the way.
Some good points in article.
There are some things that may need more attention.
1) Update from another table is not safe as it should be (in case of multiple values, final value will be sort of random)
2) Schemas as namespace separators (grouping tables inside database)
3) Extern join syntax in Oracle is actually more vulnerable (in case of error in multicolumn syntax it fallbacks to normal join). So, it's not better or easier - it is just created before standard JOIN existed.
4) Crucial difference how buffer-vs-filesystem cache works
5) Miss of plan stability - no solution out of the box in standard installation
6) Batch operation (in)efficiency
7) Pros/cons in undo/rollback handling
[likely some more that can't think of right now]
For some reason I find a query that uses (+) way easier to read than verbose outer joins. Probably because it's near the field and you see immediately "hey, this can be null".
Yes, it makes the query harder to migrate to other DMBS and to collaborate with non-Oracle persons.
To make it terser, omit `outer` because it is redundant. Now it's up to whatever you find easier to type. Definitely `left` or `right` for me. (+) is a really awkward sequence on QWERTY, at least.
What he demonstrated was just a way of making additional tables based on existing ones. While inheritance works sort of like partitioning except the child tables can contain additional data. Selecting from the parent will display all data from the child.
Biggest difference for me is DDLs are transactional in Postgres, but not on Oracle.
That means migration scripts for software on Postgress can just have all DDLs (alter, create, drop, grant etc.) and DMLs (inserty, update, etc.) mixed in whatever order they need to be, and if any particular line of the migration script fails - the whole thing is rolled back as if nothing happened. And then you fix the problem and run migration again. Easy.
In comparison writing migration scripts on Oracle is a nightmare - DDLs aren't transactional (THEY COMMIT ON EACH LINE...), so you have to separate them from DMLs and ensure that only the scripts that haven't passed yet are re-run later. I've worked in 3 different companies that used oracle, and there were 3 different approaches to that problem, and all 3 of them sucked :)
In one company we had several big customers each with 1 production db, and software was written on separate branches for each customer, and helpdesk staff was dealing with migrations - programmers just asked helpdesk to add a column and worked on the test db for that customer. It was a lot of unnecessary work to port changes and bugfixes between branches, but at least we knew exactly what is on each db and could fix problems by ourselves. There was no migration to speak of, just manual changes on dbs and documenting them in svn (it was before git was popular).
In another company there was one development branch and several customers, and there were migration scripts written by all developers when they made changes, which were merged into development branch for db by 1 guy whose whole job was to merge these scripts and check if migration works. It slowed down development (because when you finished your task on local db you had to make a migration script(s) and send them to be verified. And even "that guy" sometimes made mistakes and then if you fetched db scripts in the morning you couldn't work until stuff was fixed (or you had to recreate oracle db from scratch which took several hours).
That was before docker BTW, now they probably use docker so that can be less of a problem.
In the third company we had one customer but with hundreds of installations, and we had one development branch with frequent releases. Developers maintained migration scripts between release, major and minor versions. There was no "that guy" - we had smoke tests instead, and it sometimes took more time to write that migration script(s) than to change the code.
So you want to add 3 columns to 3 tables and fill them? And it has to be done in order because of dependencies? Write no less than 6 migration scripts (alter table 1, update table 1, alter table 2, ...). Add them with proper names and some boilerplate to the migration scripts for minor versions (3.4.5 -> 3.4.6). But that's not all! We also have migration scripts for major versions (3.4.0 -> 3.5.0), so you also need to add them there. You have to check the migration separately because these scripts often use shortcuts to run faster. So your scripts might break despite working for minor version migration.
Then there's the scripts for release version migration (3.0.0->4.0.0). Add your scripts there as well, and test once again.
Oh, and testing these scripts on test data doesn't mean they will work - each installation of db changes slightly over time - people add stuff from ui. There are rules what they can change and what they cannot, but if you don't think about it you might break something with your migration scripts on production despite it working on test data.
When that happens you have to write migration fixes which need to detect that problem and fix it on data you don't have direct access to :)
It was a nightmare.
Meanwhile Postgress is just doing the right thing, write 1 migration script with everything in it, if it works it works, if not - it rollbacks. Nobody thinks twice about it.
> Biggest difference for me is DDLs are transactional in Postgres, but not on Oracle.
The answer I received more than once from Oracle evangelists regarding transactional DDL: it's useless and if you need it, you are not testing your scripts properly
Transactional DDL is, IMHO, the best way to write database tests! I've been really happy with database testing since I built and started using a library that runs DDL and uses savepoints without ever committing a transaction.
Thanks for your inputs.
Yes, you are totally right about the DDLs, you cannot mix it with DML.
Seeing all the comments, it looks like a good decision to migrate. :)
I wish antitrust legislation in the US was brought back or enforced. Forbidding direct comparisons is an anticompetitive practice if I've ever seen one.
Whenever I see comment making comparison between Oracle and Postgre, I cant help bug wonder why isn't it compared to Enterprise DB, which is sort of like the unofficially official Postgre for Enterprise products.
I would bet the next "Oracle" (as in, large IT company) of the decade 2020-2030 will be based on PostgreSQL. Can't see an obvious candidate, yet. Perhaps someone has seen some interesting companies heading in this direction?
How come nobody has implemented an Oracle compatibility mode for PostgreSQL?
Or in general, why don't databases support each others SQL dialect? It can't be that much work, at least if one is content with only supporting the majority of applications, and seems pretty essential for popularizing a specific database.
Looking at the article, supporting Oracle syntax seems trivial in all cases except for adding full MERGE support.
It's been around for years. Earlier on PostgreSQL did make some efforts of being recognizable to Oracle users... look at Oracle PL/SQL and PostgreSQL PL/pgSQL... very similar and I recall that similarity being intentional.
Also, there is the SQL standard. Rather than supporting all vendors' syntax and features, which can change on the whim of some competitor that probably doesn't have your best interests at heart, it's better to adhere to the standard if you want the broadest applicability. PostgreSQL does exactly that with few deviations from the standard, relative to the industry as a whole. At the end of the day it's really about goals and not every RDBMS has the same goals; with PostgreSQL standards compliance is a goal.
It was actually a pretty big setback. We were using PostGIS to support spatial queries (a key requirement), and Oracle Spatial was just not at the same level (both in performance and features). The development experience with Oracle was also awful. The licensing for Oracle was highly granular, down to the feature level. More than once I'd identify a feature that provided a solution to an issue through online research only to be prevented from using it due to the customer not having the requisite license for it.
And the support was useless. Oracle was so complex (by design) we resorted to contacting support a couple of times - they would send out an "engineer" who could turn any technical troubleshooting session into a sales presentation for some Oracle product or feature that would "solve" whatever the issue was.
I will never work on a project involving Oracle again (barring obscene amounts of money to assuage my frustration, of course).