Hacker News new | ask | show | jobs
by sfilipov 2637 days ago
I don't disagree overall but I'm using SQLAlchemy (Python) on a daily basis and it works quite well.

I agree that if you are building ORM with the intention of hiding so many SQL details that the developers don't need to learn SQL, you'll run into problems.

SQLAlchemy on the other hand starts simple, but is flexible enough that you can do inner/outer joins, you have different loading strategies etc. It is quite nice if you are proficient enough in SQL.

When writing complicated select queries, I'd start by writing the raw SQL then spend my time representing it in SQLAlchemy. It's a bit less than double the work sometimes, but working with the SQLAlchemy objects in Python is quite nice as long as you pull them in an efficient manner. SQLAlchemy just feels like a well thought out ORM.

1 comments

Never heard of SQLAlchemy. The core part of the lib looks quite nice, the problem is its design is driven by the needs of its ORM.

If you define your database properly typed, defaulted, and not null unless an update field that sort of layer isn't needed at all.

Using the correct native types on your function params or class structures means you'll never run into type mismatch or cast failures loading or unloading from the db.

So I just don't get why the need for ORM. The only thing that needs mapping is the table schema to your data type object/structs.

ORMs don't offer anything of value their and often make things you need to do in your job harder.

I'm sick of the argument that those who advocate for the use of an ORM have a dislike of SQL, or simply don't understand it. Perhaps this is true of newer developers who have "grown up" on ORMs, but for the most part advocating for an ORM is simply wanting to employ the "don't repeat yourself" thing and having enough battle scars to know that trying to compose your own SQL statements results in vulnerabilities, bugs, and a mess.

> If you define your database properly typed, defaulted, and not null unless an update field that sort of layer isn't needed at all.

Which is a laudable goal but not realistic if you've worked on any backend that has even a hint of organic growth or legacy. I know of at least two books on the subject of dealing with anti patterns in and the refactoring of database schemas.

Database schemas are the mortgage of technical debt, they can take literally decades to pay off if you don't get them "right" in the first place, or you pivot, or you're growing quickly, or you have a large team. Using an ORM allows you to abstract away and isolate a lot of the technical debt, create single entry points, and then fix those.

And in the part where using the ORM's syntax would obfuscate or complicate compared to SQL, then just write your own SQL query anyway and put this in a virtual view.

Sure, you can create your own entry points that handle the madness and domain knowledge, with your own SQL handling to put the pieces together that return a nicely mapped object or data structure. Then what have you done? You've written your own ORM.

Vulnerabilities from writing SQL? ORMs do not solve this. Prepared statements and parameterised queries solve this.

Bugs? You're shitting me, from what? Untested queries?

A mess? Clean it up. You can make a mess with an ORM too.

> Which is a laudable goal but not realistic if you've worked on any backend that has even a hint of organic growth or legacy.

Except it is realistic and I have cleaned up a lot of "organic growth" and "legacy code" like this.

> I know of at least two books on the subject of dealing with anti patterns in and the refactoring of database schemas.

Yea.. where do you think I got these ideas?

> Database schemas are the mortgage of technical debt, they can take literally decades to pay off if you don't get them "right" in the first place, or you pivot, or you're growing quickly, or you have a large team.

Database design is not hard and your schemas/changes should be peer reviewed like any code. It's also not static, yes, some, SOME, very few situations are essentially unchangeable.

But most database design is easily extended without causing breaking change.

Breaking change is more problematic but not in any way more or less than any other breaking change.

Then again, I'm kind of curious how any of these problems are solved by an ORM.

> Database design is not hard ... I'm kind of curious how any of these problems are solved by an ORM

Database design is indeed not hard (for the vast majority of use cases), but we can't predict the future. I'll give you an example from our app.

We've recently had to update the schema to support multiple different tax amounts for transactions. This is several different tables in the old schema, which all now have an extra child table each, and a lookup table to store the tax rates, to write/read those tax amounts for the transactions (basically a 1 .. n table for each parent table). Pretty basic stuff.

Anyway, historically, i.e. when the app was written 15 years ago, it was in a single market with a single tax rate on transactions. The original developers didn't foresee that the platform would turn into a multi-billion turnover level system, that would expand into different countries which have their own complex tax rules. It made sense that you have an amount column and a tax_amount column. Now it doesn't make sense, because hell no i want this done right and we're not going to add more columns we're actually going to model and normalise this correctly and move the tax amount(s) into child tables.

Because the older code had all of the interaction with the database not so abstracted, poorly factored if you will, we had to update dozens (i forget how many) of queries to add all the new joins and of course it needs to be back/forwards compatible, and we have to roll this out piece by piece because (as i said above) this is a multi-billion turnover level app.

The existing queries are in the region of 50+ lines of SQL with a dozen or so joins in some case. That's not actually complex SQL, it's pretty mundane and trivial SQL - SELECT a few columns, JOINing a few other tables, WHERE stuff matches conditions. There's the occasional IF, CASE, COALESCE, statement in these queries as well, and the odd UNION here and there. Again, pretty simple stuff. There are a few nice surprises from it being an old schema, polymorphic relations for example.

The changes amounted to hundreds of lines of SQL additions, the equivalent in actual "getting stuff done" code additions, making sure this is all tested so there are no regressions in moving from the old tax column in the parent table to getting the tax amounts out of the new tables, and weeks of testing and logging to find any edge cases or bugs.

In the newer model code that wraps all this stuff in an ORM? A few lines of code, because we can just compose a role into the model classes giving them access to a tax_amounts method that returns the details of each transaction's multiple tax amounts + rates + other stuff. The ORM knows about the new tables, the queries run by it add in all the necessary joins for us. The model classes are correctly factored to compose the Taxes role.

Now you can argue that if the original code had been better factored then we wouldn't have been in the same situation, and it would have been easier, but we would still have had to update multiple queries. This is where the ORM shines and it's why I'm tired of writing trivial SQL having done so for twenty years against half a dozen different database engines. I really don't want to have to write SELECT a few columns, JOINing a few other tables, WHERE stuff matches conditions type queries again. Unless of course I'm sat in the database terminal client, which happens more times a day than I can COUNT.

Not really sure how ORM solves anything here.

That said, your old and new solutions both sound wrong to me.

Tax is a line item.

To me it's as much a line item as a product, service, shipping, handling, booking fee, or any other thing that gets summed to form the total order / transaction value.

You then get to ref optionally in the way that makes sense to map attribution correctly.

Tax belongs to an item? No problem. Two taxes, one item, no problem! One tax multiple items? No problem!

> Not really sure how ORM solves anything here. That said, your old and new solutions both sound wrong to me.

The new solution does what you talk about, I guess I didn't explain it very well.

The use of an ORM is abstracting away a lot of the technical debt in the schema and isolating it so we can concentrate on fixing that debt without the worry that it continues to spread. It's much looser coupling than continuing to litter the codebase with manually written SQL.

Edit: reading more of the replies to your original comment - there is btschaegg's comment that having a narrow scope is what you should aim for[1]. There's also a reply (first one) on stackoverflow that talks about not leaking the abstraction[2]. Finally the key to our use of an ORM is that the ORM is not our model, we do not leak the abstraction, and we can drop down to manual SQL when we need it but that is still in a narrow scope.

[1] https://news.ycombinator.com/item?id=19592042

[2] https://softwareengineering.stackexchange.com/questions/3045...