The Slotted Counter Pattern

Y	Hacker News new \| ask \| show \| jobs

	The Slotted Counter Pattern (planetscale.com)
	92 points by hollylawly 1459 days ago

11 comments

tantalor 1459 days ago

Basically same as "Sharding counters" (2008)

https://download.huihoo.com/google/gdgdevkit/DVD1/developers...

Also in Brett Slatkin's "Building Scalable Web Apps with App Engine" (2008)

https://youtu.be/Oh9_t5W6MTE?t=1181

link

ris 1459 days ago

Was going to post this. "Sharding" seems like a better term for communicating the idea.

link

zimpenfish 1459 days ago

I dunno - "sharding" would generally imply (to me, at least) that you're spreading the counters amongst different databases (or tables in a smaller context.) All the counters are in the same table here which is counterintuitive to "sharding".

link

tantalor 1459 days ago

Semantics

link

iv42 1459 days ago

Also, it's basically identical to the pattern described as Counter Tables in "High Performance MySQL" (3rd edition), published in 2012.

link

tantalor 1459 days ago

Indeed, it also calls them "slots":

> You can get higher concurrency by keeping more than one row and updating a random row... just choose a random slot and update it

https://www.oreilly.com/library/view/high-performance-mysql/...

link

_vvhw 1459 days ago

Balance tracking seems to be the Achilles' heel of SQL databases.

For example, my experience has been that in the world of payments, the nature of money transactions is to debit perhaps many different accounts on the one side, but credit only a handful of accounts on the other, so that group commit can't fully amortize fsync.

Performance panaceas mostly come down to sharding, but sharding doesn't work well where you need strict updates to balances.

At work, we saw this play out several times in different systems, and decided to do something about it. We took the ledger of an open-source payments switch called Mojaloop, and extracted it as a distributed financial accounting database called TigerBeetle, designed to track financial transactions at scale.

The key performance insight was to dial up group commit. We batch up balance updates so that a single DB query can do on the order of 10k balance updates. We then fsync the batch with a single write before commit, moving the performance needle out from 1k-10k TPS to 1m TPS.

This is the advantage of a purpose-built database that's designed for counting at scale.

More information, including our design decisions are in the repo here: https://github.com/coilhq/tigerbeetle

link

elsurudo 1459 days ago

Makes sense, but I can't help but feel it's a solution at the wrong abstraction level. It's a shame the DB can't figure this out for you.

link

yccs27 1459 days ago

Yeah, it seems like it would be possible for the DB engine to aggregate all these increments into one update. If you have two increments by one each in the queue, why not make it a single increment by two? I'm not sure though how much computing power it would need to figure that out...

link

LesZedCB 1459 days ago

wouldn't that break the atomic and isolated rule of ACID?

link

samatman 1459 days ago

Not necessarily. If both updates are in a single transaction then it's valid for the query planner to batch them, although that seems unlikely in the use case this table layout is designed for.

link

MauranKilom 1459 days ago

Not any more than the slotted counter pattern...

link

LesZedCB 1459 days ago

how does the shape of data impact the storage applications safety contract?

link

cryptonector 1459 days ago

The DB could figure it out IF it had monoid counter type to use as the column's type.

link

yccs27 1459 days ago

Interesting. This basically implements a parallel algorithm for summation/counting, by calculating a partial sum in each slot, and then merging the results with every read. This approach could be applied more generally to values in a commutative monoid, and there are probably other parallel algorithms that could be implemented in a database a similar way.

link

ukd1 1459 days ago

Cool, but this method will still get locks in whatever percent of cases, regardless of if there are slots not in transactions. In Postgres you can probably do this with slots using SKIP LOCKED; though in practice I belive you have to deal with the case where everything is locked, by falling back to waiting for a lock.

UPDATE counters SET count = count + 1 WHERE name = ? AND slot = (SELECT slot FROM counters FOR UPDATE SKIP LOCKED LIMIT 1) LIMIT 1

link

robocat 1459 days ago

How does that avoid race conditions?

SKIP LOCKED doesn’t seem to be designed for that purpose: https://www.enterprisedb.com/blog/what-skip-locked-postgresq...

link

ukd1 1456 days ago

Yep, it's meant for queues - a kinda popular one I maintain for ruby (QueueClassic) uses it. However, you can use it outside of that - this is basically a queue to update a column; note the caveat in my original post. SKIP LOCKED will I believe not find anything and just be fine with that if all are locked; you can deal with this, but I didn't in my example SQL.

link

deepsun 1458 days ago

In SQL `count = count + 1` does not have race conditions. The discussion about contention (bad performance), not about race conditions (incorrect results).

link

hiptobecubic 1459 days ago

I would definitely complain about this in code review without some performance testing showing that 1) it's actually faster than waiting for a lock and 2) it's better than just increasing the number of slots.

link

ukd1 1456 days ago

I'd be doing the same in the case of the post; it's guarneteed to hit locks some percentage of time. Should be better than a single slot though.

The whole point of the article was that idle-in-transaction due to locks on a counter in another transaction cause bottlenecks; skipping them using SKIPLOCKED with enough slots eliminates this. Randomly selecting them also randomly picks the locked one, causing a wait.

link

wcarss 1459 days ago

At the extreme end of this, you could also append a new row for every event and count them. If the number of rows would be too big over some period of time, you could similarly aggregate them occasionally and clear the "scratch" table.

link

teej 1459 days ago

This "Slotted Counter" approach optimizes for a write-contention constraint, specifically row locks. From 10 seconds of Googling it seems InnoDB has other locks it uses on INSERT, I'd first check if moving the write contention to gap locks actually helps or not.

One side benefit of this approach is that getting the final aggregate is cheap, where compacting an append-only log table might not be.

link

wcarss 1458 days ago

Thanks for taking the time to look into it, that's an interesting point. According to the manual [1], InnoDB's insert locking should not prevent other inserts from executing, it only takes an exclusive row lock on the inserted row. I agree that measuring would likely be smart.

This makes some intuitive sense, though: general purpose databases are expected to be _pretty good_ at handling the case of "add new data" with no other specific conditions, e.g. on other rows or tables' existing data.

I also agree with your last point. Running count() all day on this wouldn't be great, and compaction would take real time. I assumed that most high throughput write scenarios for something like an event count or view count can be a few minutes (or hours, or days) out of date, at which point read caching would be my first stop before clever summation algorithms, which are still pretty cool.

1 - https://dev.mysql.com/doc/refman/8.0/en/insert.html

link

idk1 1459 days ago

I accidentally ended up at a similar solution once.

I has the same issue, and I fixed it by adding an associated 'count_table' row for each hit, and deleting the row once it had been added later on to the final count. Which actually fixed the issue. Then refactored it so each user or ip had it's own 'count_table' row. It meant the final total count lagged a bit, by 60 seconds or so once the count_table rows had been counted up and deleted, that was the downside but it was totally acceptable.

I wish I'd have thought of this, it's much better and simpler I think haha.

link

asadawadia 1459 days ago

the fundamental problem with counters is that it does a read-modify-write cycle which is quite harsh on the DB - a better approach is to take advantage that counters are cumulative and we can keep the delta events only and 'merge' them on reads

or just buffer the counters in redis and then flush them out

I have a counters API that does precisely this

Docs are here: cmd+f: 'Counters API now live'

https://blog.aawadia.dev/api/

link

nyanpasu64 1459 days ago

This feels like the database locking version of https://travisdowns.github.io/blog/2020/07/06/concurrency-co..., which instead covers in-process concurrency speed.

link

dima_vm 1458 days ago

If we need slotted counters, we're pretty close to the boundary of "right tool for the job", and we need to consider using a time-series database.

For one use-case it might be ok, although if we expect more counters in the future, it's easier to make it now than refactor later.

link

sonicgear1 1459 days ago

Wouldn't querying the count be slow using a WHERE clause?

link

LesZedCB 1459 days ago

it was indexed which should help, but i assume that the reads are far less common than the inserts. reads could even be scheduled/automated and stored in a cache table if they need to be faster and ok being a little stale

link

Andys 1459 days ago

There's only dozens (or at most ~100) rows.

link