Hacker News new | ask | show | jobs
by scarface74 1149 days ago
When I was choosing jobs in the real world[1], I would avoid any company that depends heavily on stored procedures for business logic and where things were ruled by “database developers” and all the developers were doing were calling stored procedures.

Stored procedures are harder to unit test, do automated rollbacks, version control, etc.

I’ve never seen a system that 5 years in anyone said “I’m so glad we decided to use stored procedures everywhere”

[1] I work in consulting now and have no allusions that I face the same issues that day to day developers do. I get to lead development “solutions” and move on.

3 comments

I would agree, but I think it's worth checking out one of Siver's postgres repos where I thought he tested things pretty well[0]. Also things like Postgres are often run as a "server" on local dev or CI but could easily be run more ephemerally[1]. That should help with testing various states and behavior of the database.

I still would actually follow what you said thought because often times places which rely on stored procedures are, as you say, ruled by developers with the database access to manage those procedures and from what I've seen often do not have anything close to a reproducible setup. Besides the fact that often it's some proprietary database which is much more difficult to run locally than sqlite3 or PostgreSQL.

Still I keep that separate from considering if views/stored procedures are actually a good solution. I think sometimes it really is. Honestly Sivers' experiments really convinced me of that.

[0]: https://github.com/sivers/store

[1]: https://jamey.thesharps.us/2019/05/29/per-project-postgres/

It's worse than that.

Your database will be the first major thing to cave under load.

The traditional tools most folks use (postgres/mysql/maria/etc) want to scale vertically, not horizontally (you can do it, but it's hard, and has lots of drawbacks).

Eventually - You will hit hardware limits. Providers only have instances that are so big. There is a ceiling on how much you can scale your DB without having to do major, major work.

I've seen this pattern play out now at 3 high growth companies I've worked at. The one that was using lots of stored procs and materialized views got hit the fastest and the hardest.

More generically, scaling stateful things is generally hard than scaling stateless things (because you have to figure out what to do with the state).

Even horizontally scaling systems you either have to redistribute the data on scale actions or keep the data in place and send new data to new infra (which may or may not be helpful)

When it happens it is time to actually use the big boys databases.
I’ve worked with the big boys and seen plenty of times where databases fell over no matter how much data you through at it because of locking issues and other database contention issues.

It’s much easier to horizontally scale a bunch of application servers than database servers.

That is why big boys databases do clusters.
“clusters” only help when you can deal with eventual consistency (which is often admittedly more than most businesses realize). Sure you can scale your read only instances.

BTW: I work in the consulting department of a little company I am sure you have heard of that knows something about dealing with “the big boys” at scale.

Great, me too, several of them actually.
MySQL doesn't have materialized views, but it has better replication (at least using Aurora in AWS) than the big boys databases.

Adding more replicas and changing the instance size scales much better than fine-tuning the big complexity of Oracle and its materialized views.

So, maybe it is not the tool, but how to use it what matters. And fuck Oracle if by big boys databases you mean Oracle.

> When it happens it is time to actually use the big boys databases.

For years Oracle had a very small limit of how many CPU's stored procs could use baked into the DBMS engine. IIRC, it was something like 4.

Maybe that's changed now, I don't know. I do know Oracle qualifies for most as being one of "the big boys databases."

It is a matter of database tooling, Oracle and SQL Server provide quite good IDEs, graphical debugging, merge tooling, testing infrastructuring.

I can tell you that at least a pharmaceutical is quite happy to have plenty of stored procedures into their Oracle databases, including making use of Apex, and it is at least several decades in production.

I’ve worked on a codebase that had 300K lines of pl/sql and I didn’t find it all that fun to work with. The problem is that you are working in a very constrained environment, starved of language features and libraries, and with a deployment model that is deeply tied to the data and therefore more cumbersome. The IDE’s helped but didn’t matter that much. Also, oracle has per-cpu licenses, so the more code runs in the database the more expensive it becomes.

I think the best design is an API in a language designed for building API’s, on top of a dumb database, with all access gated through the API.

If you're going that approach, I think a lot of databases have facilities for extending outside plain SQL. Afaik Oracle can bind to Java and Postgres has an extension interface

I do tend to agree with API-outside-the-DB pattern. I guess Oracle supports some form of QoS but I don't think MySQL and Postgres do. That means having multiple apps hit the same schema can cause starvation issues

Also not sure how hard it is to monitor user resource usage. That's more important for billing (even if it's "fake" money from 1 department to another)

And when you get ready to rollback, can you just revert all of your code and deploy? Can you just do a “git branch” and work on your own isolated code in your dev account?” How well does merging work when you do a pull request and have to merge your code and make sure your stored procedures are up to date?
Yes, because there is such thing as CI/CD pipelines, and test environments, for database code as well.
If there are ten developers with 10 feature branches are they going to be running 10 instances of Oracle with their own “branches” of the stored procedures?
Yes,install the RDMS into their computer just like any other development tool.

Yes, learn to use multiple schemas and instances, just like using multiple deployments into Apache, Tomcat, IIS,...

Or even better, join the hype, and use containers with volumes configured for each feature branch.

And then copy “enough” data on their computers to have a realistic test and make sure you have some type of script so sensitive data isn’t on their laptops and ensure schema changes are constantly propagated to each developers workstation. This isn’t even mentioning if you have a commercial database like Oracle where you have to deal with licenses.

And how do you merge changes and do rollbacks?

Not to mention the whole

GetCustomer_1

GetCustomer_2

GetCustomer_3

Where code is copy and pasted and modified slightly.