Hacker News new | ask | show | jobs
by ergo14 1949 days ago
We are running sqlalchemy in a giant environment - hundreds of millions of users. SQLAlchemy has been a blessing for us. Blame users that use flask-alchemy or whatever else is cool. This is exactly the same type of comment that blames ORM-s for ineffective SQL being used because people don't know how to write the queries in the first place.
5 comments

You put it better than I could. There are certainly naive examples of how to use it out in the wild but that's not SQLAlchemy's fault. It's a sophisticated tool that accurately reflects how a database works.

I caution people against using Django because the ORM makes so many weak, simplified assumptions about how a database works that don't bear out in practice. It's fine for a blog. It falls apart when you're working with Real Data.

I've used Django with "real data", as in accountancy software for multiple years of transactions across 200+ fairly large clients and it's absolutely fine.

Whereas I've used Sqlalchemy and getting it to do a basic join, turns out there's 3 different ways of doing it all with an insane amount of crap to find the right way of doing things. I actually left my old job in part due to Sqlalchemy being too painful to work with.

There is absolutely a lack of a decent SQLA tutorial for IRL code.

The doc is very formal, and if you want to make a website, a quick script or a data migration, you are left on your own to find the proper setup.

I use Django with "real data" as well and, generally, it's pretty good once you learn how to wield it. My biggest issues are lack of multi-column primary keys and inefficient queries when doing .count(), that often introduce unnecessary group by statements in the subquery. If either of those could be addressed, I'd enjoy it so much more.
Django is optimizing for the 99.99% of people rather than the 0.01% of people who manage billions of rows on a single table. I like SQLAchemy, but for most people the added cost of managing connections, transactions, have to decide whether to use the 'ORM' or not, manually having to 'commit' changes, eager-loading vs. lazy-loading, is a complete waste of time if all they want is a website that will be used by - at best - a couple hundred people concurrently.
What weak assumptions does it make?
Yeah our org should really move away from flask-sqlalchemy and be more declarative about what session changes should actually be made on.

Being more hands on with sqlalchemy is probably the right approach.

I passed on flask-sqlalchemy because it was not obvious to me how it worked together, and I did not see the value over just using Flask and SQLAlchemy with no special integration. Has worked now a couple of years in, will see how it is in a few years more.
Not having touched this stack in a while: what is best current practice for wiring together sqlalchemy and flask (including unit tests)?
Potentially don't wire them together? That is, isolate them in separate parts of the code.

I always found Flask-SQLAlchemy a strange approach because it ties opposite and somewhat orthogonal parts of the request lifecycle together. When requests-to-data-transactions is a 1-1 relationship it's a convenience, but once you step out of that I think it's better to do things by hand.

My main app has a simple layer structure of "request > business logic > data layer". All of the business logic functions start a SQLAlchemy transaction using a Python decorator that is ~10 lines of SQLAlchemy code.

Check out how things are wired with in this Pyramid cookie cutter.

https://github.com/Pylons/pyramid-cookiecutter-starter/blob/...

I think same approach should work well for flask - just tie the session to the request object on first access.

Side note: I think we met on IRC few years ago ;-)

There is a lot going on in that cookie cutter. The “transaction” library for transaction management, wired into Pyramid with pyramid_tm, wired into SQLAlchemy with zope.sqlalchemy. Works nicely for Pyramid, but I don’t know how that translates to Flask.
Think on it like a middleware that wraps a transaction. It's fine, till you start consuming other Io based services (elastic/solar/...) And you find that this blocks one connection per request..
Doesn't, I wanted to show how it can be wired to request.
> hundreds of millions of users

Just curious, but how many employees? How many specifically are responsible for writing code that interacts with SQLAlchemy and/or the objects it creates. While I don't doubt it is _possible_ to make SQLAlchemy work great, I think it's mandatory to overstaff with engineers/programmers in order to do so.

> the same type of comment that blames ORM-s for ineffective SQL being used because people don't know how to write the queries in the first place.

I don't think this blame is as unfair as you imply, even if if were the primary cause of ORM induced performance disasters. In my experience the main selling point of ORMs has always been the promise for people who fancy themselves programmers to leverage DB technology without having to sully themselves or their cherished OO model by learning the relational model, databases or SQL. The other, equally false promise has been that the ORM would somehow allow you to "abstract" over your concrete DB so you could easily swap out one for another. Both these ideas now seem utterly misguided to me (although many years ago I probably fell for the same nonsense). For all of SQL's warts, I find the relational model conceptually much superior to OO. Moreover since the way data moves in and out of the DB (and the consistency/transactional and performance constraints around this) is often more architecturally central than the python code that orchestrates it, what ORMs are trying to do, shoe-horning the well thought out relational model into the badly thought out OO model seems utterly backwards to me -- spanning the cart before the horse so to speak. Now I will readily admit that SQLAlechemy is about the least bad ORM I have encountered, but I still absolutely fail to see what the point of it's ORM part is (I can see a bit of the appeal of being able to dynamically construct or compose queries with core, although I'd try hard to avoid the necessity).

Clearly you know what you are doing with hundreds of millions of users and you are not the sole large user of SQLAlchemy either, so I am evidently missing something. Can you maybe give one or two examples of things that are much easier/robust/... because of SQLAlchemy, compared to just writing SQL, in separate files, and executing that from you python code?

To be clear, there are various things that I think are broken about how databases work, and I'd really like to see them fixed but I don't see how SQLAlchemy helps much with most of the below (with the notable exception of query parameterization).

- no high level support for migrations: it should be possible to get a readable aggregate schema defintion out for example and it isn't

- parameterization and composition of queries is awkward and limited

- (for most DBs) lack of in-process support

- bad testing framework support

> Blame users that use...

I am, and I'm warning them, whats wrong with that?