by jmuguy
880 days ago
Rails isn't super opinionated about database writes; it's mostly left up to developers to discover that, for relational DBs, you do not want to be doing a bunch of small writes all at once. That said, it does have tools that specifically address this, which started appearing a few years ago: https://github.com/rails/rails/pull/35077
The way my team handles it is to stick Kafka in between what's generating the records (for us, a bunch of web scraping workers) and a consumer that pulls off the Kafka queue and runs an insert when its internal buffer reaches around 50k rows. Rails is also looking to add some more direct background-type work with https://github.com/basecamp/solid_queue but this is still very new. Most larger Rails shops are going to be running a second system and a gem called Sidekiq that pulls jobs out of Redis.
In terms of read queries, again I think that comes down to the individual team realizing (hopefully very early in their careers) that it's something that needs to be considered. Rails isn't going to stop you from making N+1-type mistakes or hammering your DB with 30 separate queries for each page load. But it has plenty of tools and documentation on how to do that better.
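To make the N+1 point concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module (not Rails). The `authors`/`posts` tables and both function names are made up for illustration. It only shows the shape of the mistake and of the single-query alternative; in Rails, eager loading (for example `includes`) is the usual tool for the same problem.

```python
import sqlite3

# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")

def titles_n_plus_one(conn):
    """One query for the authors, then one extra query per author."""
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    queries = 1
    result = {}
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries

def titles_single_query(conn):
    """Same data fetched with a single JOIN."""
    rows = conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    ).fetchall()
    result = {}
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result, 1

slow, slow_queries = titles_n_plus_one(conn)
fast, fast_queries = titles_single_query(conn)
```

Both functions return the same mapping, but the first issues one query per author on top of the initial one, so its query count grows with the number of rows.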
|
The new queue you linked is database-backed, but the whole point is that you want to just run a job without needing to serialize anything outside of your process. It should just schedule the job onto the thread pool and give you a promise for when it's done.
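As a rough sketch of what "schedule it onto the thread pool and get a promise" looks like, here it is in Python with the standard-library `concurrent.futures` module. The job function `send_welcome_email` and the user id are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

def send_welcome_email(user_id: int) -> str:
    # Hypothetical job body; a real one would do I/O here.
    return f"sent to user {user_id}"

# In-process pool: nothing is serialized or written to an external queue.
pool = ThreadPoolExecutor(max_workers=4)

# submit() returns immediately with a Future (the "promise").
future = pool.submit(send_welcome_email, 42)

# Block only when the result is actually needed.
result = future.result(timeout=5)
pool.shutdown()
```

The flip side of keeping everything in-process is that nothing is persisted, so a job that has not finished is lost if the process exits.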
The Kafka thing also seems to be an example of what I mean: in Scala I'd just make a `new Queue` with a thread-safe library, and have a worker pull items off it and do an insert every hundred rows or so, or after e.g. 5 ms have passed, whichever is first. No extra infrastructure needed, minimal RAM used, your queueing delay is in the single-digit ms, and you get the scaling benefits. Takes maybe 10-20 lines of code.
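A minimal sketch of that worker, written in Python rather than Scala and using only the standard library. The names `run_batcher` and `flush`, and the `None` shutdown sentinel, are assumptions for illustration; `flush` stands in for whatever performs the bulk insert.

```python
import queue
import threading
import time

def run_batcher(q, flush, max_rows=100, max_wait=0.005):
    """Drain q, calling flush(batch) once max_rows are buffered or max_wait
    seconds have passed since the batch's first row, whichever is first.
    A None item tells the worker to flush what it has and stop."""
    while True:
        first = q.get()  # block until a batch can start
        if first is None:
            return
        batch = [first]
        deadline = time.monotonic() + max_wait
        stopping = False
        while len(batch) < max_rows:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            try:
                row = q.get(timeout=remaining)
            except queue.Empty:
                break
            if row is None:
                stopping = True
                break
            batch.append(row)
        flush(batch)
        if stopping:
            return

# Usage: collect batches in a list instead of hitting a real database.
batches = []
q = queue.Queue()
worker = threading.Thread(target=run_batcher, args=(q, batches.append))
worker.start()
for i in range(250):
    q.put(i)
q.put(None)  # ask the worker to finish
worker.join()
```

How the 250 rows are split depends on timing, but no batch exceeds `max_rows` and rows are flushed in the order they were queued.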
You can then take that and abstract it into a repository pattern so that you could have an ORM that does batching for you with single-item interfaces (for non-transactional workflows), but none of them seem to do this.
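One way such a repository could look, again as a Python sketch and not any existing ORM's API. `BatchingRepository`, `save`, `close`, and the `insert_many` callback are all hypothetical names. Callers save one row at a time and get a future back; a background thread groups rows into bulk writes. The sketch creates `Future` objects by hand, which works but is normally left to executors.

```python
import queue
import threading
import time
from concurrent.futures import Future

class BatchingRepository:
    """Single-item save() on the outside, bulk inserts on the inside."""

    def __init__(self, insert_many, max_rows=100, max_wait=0.005):
        self._insert_many = insert_many  # callable taking a list of rows
        self._max_rows = max_rows
        self._max_wait = max_wait
        self._q = queue.Queue()
        self._worker = threading.Thread(target=self._run)
        self._worker.start()

    def save(self, row) -> Future:
        """Queue one row; the future resolves once its batch is written."""
        fut = Future()
        self._q.put((row, fut))
        return fut

    def close(self):
        self._q.put(None)  # sentinel: flush what is pending, then stop
        self._worker.join()

    def _run(self):
        while True:
            item = self._q.get()
            if item is None:
                return
            pending = [item]
            deadline = time.monotonic() + self._max_wait
            closing = False
            while len(pending) < self._max_rows:
                timeout = max(0.0, deadline - time.monotonic())
                try:
                    item = self._q.get(timeout=timeout)
                except queue.Empty:
                    break
                if item is None:
                    closing = True
                    break
                pending.append(item)
            self._flush(pending)
            if closing:
                return

    def _flush(self, pending):
        try:
            self._insert_many([row for row, _ in pending])
        except Exception as exc:
            # The whole batch fails together: no per-row transactions here.
            for _, fut in pending:
                fut.set_exception(exc)
        else:
            for _, fut in pending:
                fut.set_result(None)

# Usage: record batches in a list instead of writing to a database.
written = []
repo = BatchingRepository(insert_many=written.append)
futures = [repo.save({"id": i}) for i in range(10)]
for f in futures:
    f.result(timeout=1)
repo.close()
```

Because a failed bulk insert fails every row in the batch, this only suits the non-transactional workflows mentioned above.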