Hacker News
by bluGill 79 days ago
Which is easy to say. I've been trying to debug an overloaded queue for over a week now. (It used to work, until I discovered there were some serious race conditions resulting in one-in-a-million crashes, and every fix for them so far has not fixed things.) At least I can detect the overload and I'm allowed to toss things from the queue, but the fact is we were handling this load before I put the fixes in, and people don't like it when I now reject things from the queue, so they want the performance back without the races.

Is your queue bounded?

Does it reject entries when service times are too high?

Your debugging effort may become more predictable once the system measures how long workers take to complete each item.

I note you say it used to work while overloaded. I would argue it probably had hidden problems even then. Perhaps ask those people what the acceptable service time is and lock it in by refusing new entries when it is exceeded.

If they want both infinite queue length and consistently acceptable service times, then you must add enough worker capacity to deliver that.
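
To make the "refuse new entries when service time is exceeded" idea concrete, here is a minimal Python sketch. Everything in it is made up for illustration and is not your code: the names `AdmissionQueue`, `offer`, `take`, `record_service_time` and `work_one`, the 100-sample window, and the choice of a mean as the metric (a real system might prefer a percentile). It uses a plain lock because the point is the admission policy, not the synchronization strategy.

```python
import collections
import threading
import time


class AdmissionQueue:
    """Bounded FIFO that refuses new entries when recent service times are too high."""

    def __init__(self, max_len, max_service_s, window=100):
        self._items = collections.deque()
        self._max_len = max_len
        self._max_service_s = max_service_s
        # Sliding window of the most recent per-item service times, in seconds.
        self._samples = collections.deque(maxlen=window)
        self._lock = threading.Lock()

    def offer(self, item):
        """Return True if the item was accepted, False if it was rejected."""
        with self._lock:
            if len(self._items) >= self._max_len:
                return False  # length bound reached
            # Only shed load while work is still waiting. An empty queue always
            # accepts, so fresh samples keep arriving and rejection cannot
            # become permanent.
            if self._items and self._samples and self._mean_service() > self._max_service_s:
                return False  # workers are already slower than agreed
            self._items.append(item)
            return True

    def take(self):
        """Return the oldest item, or None if the queue is empty."""
        with self._lock:
            return self._items.popleft() if self._items else None

    def record_service_time(self, seconds):
        with self._lock:
            self._samples.append(seconds)

    def _mean_service(self):
        # Caller must hold self._lock and ensure self._samples is non-empty.
        return sum(self._samples) / len(self._samples)


def work_one(queue, handler):
    """Take one item, run handler on it, and record how long that took."""
    item = queue.take()
    if item is None:
        return False
    start = time.monotonic()
    handler(item)
    queue.record_service_time(time.monotonic() - start)
    return True
```

A producer would call `offer(item)` and treat `False` as "rejected, back off or report it"; each worker would loop on `work_one(queue, handler)`. The useful side effect is that the recorded service times are exactly the numbers you want to look at while debugging.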

Queue is and was bounded; if it gets too large we already logged an error and stopped processing. It is currently lock free, but the old version had locks (I've tried several versions with and without locks). The bounds didn't change, but before it was processing in time even under heavy load, and now it isn't.

I feel you may be adding your critical sections at too high a layer (either in the code or in the data structure) if it is severely affecting performance. Look up sharded locks, and totally order them if you must acquire two or more at once.
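
For anyone unfamiliar with the term, a rough Python sketch of sharded locks with a total acquisition order might look like the following. The class `ShardedCounters` and its methods are hypothetical, picked only to show the pattern; shard choice by `hash(key)` and ordering by shard index are assumptions, and any fixed global order would do.

```python
import threading


class ShardedCounters:
    """Key/value counters split across shards, each guarded by its own lock."""

    def __init__(self, num_shards=8):
        self._locks = [threading.Lock() for _ in range(num_shards)]
        self._shards = [{} for _ in range(num_shards)]

    def _index(self, key):
        return hash(key) % len(self._shards)

    def add(self, key, amount):
        # Only the shard that owns this key is locked, so operations on keys
        # in different shards do not contend with each other.
        i = self._index(key)
        with self._locks[i]:
            self._shards[i][key] = self._shards[i].get(key, 0) + amount

    def get(self, key):
        i = self._index(key)
        with self._locks[i]:
            return self._shards[i].get(key, 0)

    def transfer(self, src, dst, amount):
        """Move amount from src to dst while holding both shard locks."""
        i, j = self._index(src), self._index(dst)
        # Total order: always acquire the lower shard index first, so two
        # opposite transfers cannot each hold one lock and wait on the other.
        # The set removes the duplicate when both keys live in the same shard.
        ordered = sorted({i, j})
        for k in ordered:
            self._locks[k].acquire()
        try:
            self._shards[i][src] = self._shards[i].get(src, 0) - amount
            self._shards[j][dst] = self._shards[j].get(dst, 0) + amount
        finally:
            for k in reversed(ordered):
                self._locks[k].release()
```

Without the sort in `transfer`, one thread moving a to b and another moving b to a could deadlock, each holding the lock the other needs.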

You may also want to implement reader/writer locks if your load has many more reads than writes.
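
As a sketch of what a reader/writer lock does, here is a small one in Python built on `threading.Condition`, since the standard library does not ship one. `ReadWriteLock`, `lookup`, `update` and `table` are illustrative names only. This version prefers readers, so a steady stream of readers can starve a writer; treat it as a demonstration of the idea rather than something tuned for production.

```python
import threading
from contextlib import contextmanager


class ReadWriteLock:
    """Allows many concurrent readers or one writer. Readers-preference, not fair."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of readers currently inside
        self._writing = False  # True while a writer is inside

    @contextmanager
    def read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()  # wake any waiting writer

    @contextmanager
    def write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True
        try:
            yield
        finally:
            with self._cond:
                self._writing = False
                self._cond.notify_all()  # wake waiting readers and writers


lock = ReadWriteLock()
table = {}


def lookup(key):
    with lock.read():   # many lookups may run at the same time
        return table.get(key)


def update(key, value):
    with lock.write():  # updates run alone
        table[key] = value
```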

Unfortunately, nobody really teaches you these things in a really clear way, and plenty of engineers don't fully understand it either.

I didn't give nearly enough details for you to speculate like that. What you said applies to some very different queues, but not mine. (What I have is currently lock free, though this is my third redesign, with a different locking strategy for each.)