by Perseids 2669 days ago
Since looking into modern concurrency concepts I've always thought such (in my opinion obvious) batching should be part of sophisticated ORM frameworks such as Rails' Active Record. Alas, their design decisions always seem to cater more to making the dumb usages performant (sometimes automagically, sometimes by adding huge layers of cruft) than to rewarding programmers who are willing to learn a few concepts with interfaces that have strong contracts and offer better safety and performance.

E.g. please give me guidance on how to better structure my database model so that it doesn't effectively end up as a huge spaghetti heap of global variables. My personal horror: updating a single database field triggers 20 additional SQL queries creating several new rows in seemingly unrelated tables. Digging in, I find this was due to an after_save hook in the database model which caused an avalanche of other after_save/after_validation hooks to fire. The worst of it: asking how this came to be, I find out that each step of the way was an elegant solution to some code duplication in the controller, some forgotten edge case in the UI, some bug in the business logic. Basically, ending up with extremely complex control flows is the default.

So of course, if your code has next to no isolation, batching up queries produces incalculable risks.
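To make the avalanche concrete, here is a minimal sketch in plain Python rather than Rails. Everything in it is hypothetical (the `save` function, the `after_save` decorator, the table names); it only imitates the shape of the problem, where each hook looks reasonable on its own and one requested write fans out into writes on other tables.

```python
# Hypothetical mini-ORM: none of these names come from Rails or Active Record.
from collections import defaultdict

query_log = []                        # every simulated SQL write lands here
after_save_hooks = defaultdict(list)  # table name -> registered callbacks


def after_save(table):
    """Register a callback that runs after any save to `table`."""
    def register(fn):
        after_save_hooks[table].append(fn)
        return fn
    return register


def save(table, **fields):
    """Pretend to write one row, then fire that table's hooks."""
    query_log.append(f"WRITE {table} {sorted(fields)}")
    for hook in after_save_hooks[table]:
        hook(fields)


@after_save("users")
def audit_user_change(fields):
    # Added once to remove duplication in a controller.
    save("audit_entries", event="user_changed")


@after_save("users")
def refresh_search_index(fields):
    # Added once to fix a forgotten edge case in the UI.
    save("search_documents", source="users")


@after_save("audit_entries")
def notify_admins(fields):
    # Added once to fix a bug in the business logic.
    save("notifications", kind="audit")


@after_save("notifications")
def bump_unread_counter(fields):
    save("counters", name="unread")


# The caller asked for exactly one write.
save("users", email="a@example.com")
for line in query_log:
    print(line)
```

Nothing at the call site hints that four other tables get written, which is exactly why it is hard to reason about which rows a batched query might touch.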

/rant, sorry.

1 comment

I agree that with that kind of complexity (or with the belief that that kind of complexity is inevitable) it isn't a great idea. You lose isolation, and if you can't predict which rows will be touched you're hosed.

One mitigating factor: this sort of optimisation should be applied to frequent queries more than to expensive queries. In some use cases the former kind may be simple ("Is this user logged in?") even if the latter is not.
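As a rough sketch of what batching such a frequent, simple lookup could look like, assuming a hypothetical `sessions` table and a made-up helper `logged_in_batch` (neither comes from any ORM), using Python's built-in `sqlite3`:

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (user_id INTEGER PRIMARY KEY, logged_in INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO sessions VALUES (?, ?)", [(1, 1), (2, 0), (3, 1)])


def logged_in_batch(conn, user_ids):
    """Answer many 'is this user logged in?' lookups with a single SELECT."""
    ids = sorted(set(user_ids))  # duplicates collapse into one lookup
    if not ids:
        return {}
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT user_id, logged_in FROM sessions WHERE user_id IN ({placeholders})",
        ids,
    ).fetchall()
    found = {user_id: bool(flag) for user_id, flag in rows}
    # Assumption: users without a session row count as not logged in.
    return {user_id: found.get(user_id, False) for user_id in ids}


# Five pending lookups (one repeated, one unknown) become one query.
result = logged_in_batch(conn, [3, 1, 2, 3, 99])
print(result)
```

This is only safe because the lookup is read-only and touches a predictable set of rows, which is the property the hook-heavy code above lacks.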

And on keeping that complexity down: the traditional story has been "normalise until you only need to update data in one place," but often requirements don't line up well with foreign-key constraints etc. The newer story can work, though: "Denormalise until you only have to update in one place, shunt the complexity to user code, and serialise writes." It's anathema to many, but it is becoming more common (usually in places that don't use RDBMSs, though).
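One way to picture the "serialise writes" part is a single writer draining a queue, so each denormalised document is only ever changed in one place. This is a toy sketch with hypothetical names (`profiles`, `writes`, `add_order`), using an in-memory dict in place of a real store:

```python
import queue
import threading

profiles = {}           # denormalised store: one document per user
writes = queue.Queue()  # every mutation goes through this queue


def writer():
    """The only code path that mutates `profiles`; applies writes one at a time."""
    while True:
        item = writes.get()
        if item is None:  # sentinel: stop the writer
            break
        user_id, change = item
        doc = profiles.setdefault(user_id, {"name": None, "order_count": 0})
        change(doc)  # the complexity lives in user code, not in hooks


def add_order(doc):
    doc["order_count"] += 1


def submit_orders(user_id, n):
    """Producers never touch `profiles`; they only enqueue changes."""
    for _ in range(n):
        writes.put((user_id, add_order))


writer_thread = threading.Thread(target=writer)
writer_thread.start()

producers = [threading.Thread(target=submit_orders, args=(42, 100)) for _ in range(4)]
for p in producers:
    p.start()
for p in producers:
    p.join()

writes.put(None)  # all writes are queued; tell the writer to finish
writer_thread.join()
print(profiles[42]["order_count"])
```

The trade-off is that write throughput is capped by the single writer, in exchange for never having to reason about concurrent updates to the same document.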