Hacker News new | ask | show | jobs
by zzzeek 5024 days ago
I've read through all your articles regarding "object relational modelling" and am still having a problem with the notion of, "in order to do a complex relational query, we need code in the database". Stonebraker isn't entirely impartial here as he's trying to sell his own product in this area (VoltDB) which is highly dependent on the "database-side logic" approach.

There's an important tradeoff being discussed here, which is, "can we get directly the data we want from the query", versus, "do we need to load all the data into our app first and filter it there". This is of course the critical thing that a lot more people need to learn, and the work I do with SQLAlchemy is all about this. But in the SQLA approach, we use Python constructs on the app side which expand into SQL functions when rendered in a query. The effect is very similar to that which I see in most of the examples in your posts.

While I think advanced data models and rich SQL-side functionality are essential, the usage of stored procedures is IMHO not the only way to get there. In practice I often use a mix of both, depending on how verbose the function needs to be.

Keeping SQL functions as app-side constructs has the advantage of source code management. It's easier to support multiple kinds of backends (I run against PG and SQL Server a lot) since you aren't tied to a stored procedure language. There's no need to emit new stored procedure definitions to the database in order to support new features of the application. You don't have the issue of updating a stored procedure on the database side such that multiple application versions, targeted to different versions of the database function, still continue to function. I think there are ways to approach these problems in favor of SPs, but they require some thought on how the source code is maintained, managed, and deployed. For now I've just stuck with keeping most SQL functions on the app side.

The big namespacing problems I see are, what if two different kinds of "classes" want to have the same method name ? The definition of a PG function here creates a name that's global to the whole schema - this suggests we may want names that are qualified with a "class name". And what if you do in fact need two versions of the same function present to support different application versions ? In that case maybe we want to qualify the names of the functions with version ids as well. This actually sets up a great opportunity to use an application side system of rendering class/version qualified SQL names in response to plain names on the app side.

I guess my point is that the "app logic in stored procedures" approach is interesting, it has some management/deployment issues that also might be interesting to solve, but app-rendered SQL when using an effective enough app-side toolkit can solve the problem just as well in most cases.

1 comments

Stonebraker isn't entirely impartial here as he's trying to sell his own product in this area (VoltDB) which is highly dependent on the "database-side logic" approach.

Well, the quote is old, and the db he was trying to sell at the time was Informix, but I suppose that's a fair bit of truth to that. It is worth noting however, that he suggests in that paper that RDBMS and ORDBMS engines operate in different markets.

There's an important tradeoff being discussed here, which is, "can we get directly the data we want from the query", versus, "do we need to load all the data into our app first and filter it there". This is of course the critical thing that a lot more people need to learn, and the work I do with SQLAlchemy is all about this. But in the SQLA approach, we use Python constructs on the app side which expand into SQL functions when rendered in a query. The effect is very similar to that which I see in most of the examples in your posts.

The only reason we do what we do in Postgres is because we want to support multiple programming languages with minimal work. It is a matter of having this be an API accessible to multiple tools where some may be written in Perl, some in Python, some in Perl, and some in Java. If you are just writing a single app and don't want that portability, yeah, it is the wrong approach.

Keeping SQL functions as app-side constructs has the advantage of source code management. It's easier to support multiple kinds of backends (I run against PG and SQL Server a lot) since you aren't tied to a stored procedure language.

Right. There's a huge tradeoff here between "one database with logic centralized for many apps" and "one app that runs on many databases." I am not convinced you can do both gracefully.

The big namespacing problems I see are, what if two different kinds of "classes" want to have the same method name ?

Yeah, we struggled with that, which is one reason why we are using input types to construct classes. Function overloading then solves the problem.

save(asset_item) and save(journal_entry) then both work and can be discovered as needed from the system catalogs.

I am not saying this is the right approach always. I am saying it is an approach which trades away the ideal of "one app on multiple databases" for the ideal of "one database for many apps."

Choose the right tool based on what you are doing.