Hacker News new | ask | show | jobs
by goldenkey 4572 days ago
I actually have first hand experience from merely a small data set of 37,000 records of a couple fields each. I needed to run a batch update to calculate a property of each document. The property is dependent upon a count of other similar documents. Turns out that it was so unbearably slow, that there was no way to do the batch update without waiting hours.. I searched long and hard for a solution to the write/read locks but there is none. So this is my holy grail to share, and if you don't use Mongo..more power to you.

I ended up figuring out a workaround but it left a bad taste in my mouth. My workaround was to create a temporary collection and insert all the previous records, with their Category attribute updated, in a single .insert call.

I am more than sure that an SQL UPDATE query using a subselect for the COUNT(*) would have probably executed in less than a couple seconds. That's what's so sad about MongoDB. And I even had everything indexed, it made almost no difference. The write/read locks are _murder_.

    houses.db.eval(function(){
		const tmp = db[Math.random().toString(36).slice(2)];
		tmp.insert(
			db.houses.find().map(function(e){
				e.Block = db.houses.count({
					Street: e.Street,
					Sector: {$lte: e.Sector},
					Quadrant: 1
				});
				return e;
			})
		);
		tmp.renameCollection("houses", true);
	}, {nolock: true}, function(err){
		dbCallback(err);
		console.log("Done!");
		process.exit();
	});
Eh.
1 comments

that is an paragon of maintainability if I ever saw one.
I hope that is sarcasm :-)