Adding Optimistic Locking to an API | HN Mirror

Y	Hacker News new \| ask \| show \| jobs

	Adding Optimistic Locking to an API (moderntreasury.com)
	74 points by qin 1759 days ago

8 comments

valenterry 1758 days ago

> Looking at our request traffic made choosing optimistic locking fairly easy. We expect the majority of ledger operations to be reads, and we didn't want reads to block writes (and vice versa).

I don't get it. If they use an SQL database that supports ACID already, why not just lock all the ledger rows necessary with an exclusive row access when writing and otherwise just with a shared access so that the write waits until the read finishes?

cormacrelf 1758 days ago

I think the missing context is that the db reads go all the way to an HTTP client, get modified on the client, and are sent back hoping nobody else has edited the row in the meantime. Pessimistic = check row is not locked on READ; Optimistic = check lock_version matches on WRITE.

For a table where single row ops are all you can do, this is basically enough to let API users read and update rows concurrently. SQL transactions don’t survive longer than a single batch of statements sent in one request, so inadequate.

Edit: Turns out ActiveRecord’s pessimistic locks don’t use a column at all, they use builtin row locking from MySQL/Postgres (ie ‘inadequate’). So you can’t use it for multi HTTP request transactions at all.

Final edit, if you read about how databases implement ACID in the first place, optimistic locking is one of the building blocks. You can use it to reproduce the various isolation levels of SQL transactions in HTTP APIs, manually, in SQL. (Also look at MVCC.) This does not sound fun. Distributed locking is kinda like doing a group project at university, in that you’re going to have to do it all yourself, but in another sense if one writer goes MIA you all fail.

valenterry 1758 days ago

Ahh, so they sent the lock_version with the read. Now it all makes sense. Thank you!

And as you say, in this case you couldn't even reliably lock on read because you don't know whether or when a client sends a POST anyways.

paulddraper 1758 days ago

They wanted to add locking to their API, not internally.

hcarvalhoalves 1758 days ago

> Since the protocol doesn't specify how the ETag should be generated, we could have passed in our "lock_version" version numbers. But because it seemed strange to only honor the ETag headers for a single resource in our API, we decided against it.

Odd choice. There's a standard, but the developers still chose to re-implement w/ specific semantics. There's nothing on the standard saying you have to support ETags for all the resources.

paulddraper 1758 days ago

Yep.

The HTTP standard is rich with a caching, idempotence, etc.

You really have to craft a set of requirements to not find what you need there.

traceroute66 1758 days ago

Optimistic locking in a stateless environment is actually not difficult to implement. The hardest part is actually making sure #devops are aware of the concept !

Tom Kyte, he of Oracle fame had a particularly good discussion of the concept in one of his books (Effective Oracle by Design ... IIRC).

IIRC, the Oracle way is to enable rowdependencies for the table(s) in question and then use ora_rowscn.

But in reality, you can use almost anything that changes in a defined fashion (timestamps, version numbers etc.). Then all you need to do is test for it in your database stored procedures (or elsewhere in your middleware if you are not using sprocs).

durbatuluk 1758 days ago

We had similar problem (banking service). Instead of "moving" directly to destination we include a safe account in the middle.

A > safe > B

Safe account is called this way because we don't risk the misuse of money in case of rollback.

cbetti 1758 days ago

Generally in banking you either separate preauthorization (or reservation of funds) from authorization, or you separate transaction from settlement and you compute an acceptable rate of loss.

Why did you decide to go with an escrow model in your use case?

mdellavo 1758 days ago

optimistic locking is a useful technique that seems unfamiliar to many - it's worth looking into

karmakaze 1758 days ago

I thought I didn't know what it was when I heard of them, then realized that it's not a lock at all, it's just a row version column, with the basic rule don't overwrite without having seen what you're about to overwrite.

Same goes for `git push --force`, always use `git push --force-with-lease` instead.

omegalulw 1758 days ago

Wait, how do you ensure that the lock version itself is safely written by two or more writers? Not to mention what happens if you read the version and in the time you decide to do a write someone else incremented the version.

marcos100 1758 days ago

Why do you think it is not a lock?

karmakaze 1758 days ago

It is a locking convention, achieving something locks are used for, but you can't point to one thing and say, there, that's the lock. Contrast with a database row lock where no other transaction can change that row or get a locked access to it until released.

e.g. if the version column is an incrementing number, then it relies on no client unilaterally incrementing the value on failure and retrying--not much of a 'lock'.

cormacrelf 1758 days ago

I agree. Optimistic locking is more akin to an atomic CAS operation than a lock. The usage of the API is the same; if the compare fails, fetch again and retry in a loop. (Incidentally, it doesn’t experience the ABA problem because nobody writes a lower lock_version unless they wait until the field overflows and wraps around… !).

Nobody would call CAS a lock. If your system only uses CAS, it should rightly be called “lock-free”.

karmakaze 1756 days ago

The point to call Optimistic Locking by its other name Optimistic Concurrency Control (OCC) is a great one and I'll be doing that from now on.

In the spirit of trying to keep this complex subject free of misrepresentation... Compare-and-set (CAS, or alternatively compare-and-swap) is commonly used to implement lock-free algorithms.

A 'spin lock' where a thread uses only CAS in a tight loop and not carrying on until the lock is acquired is indeed a lock and not a kind of non-blocking or lock-free algorithm.

speed_spread 1758 days ago

You can actually think of the version number in optimistic locking as a shared key combination. First party to use the key gets to pick the next combination, invalidating any other concurrent copies.

Performance and implementation characteristics differ but the end result is functionally the same as a pessimistic lock, which is why they're both called "locking" mechanisms.

cormacrelf 1758 days ago

The end result is functionally very different. If you were given a shared updatable value, and were told that the only API was non-blocking and you would have to retry in a loop until it succeeded, would you call it a lock? It’s straight up not a lock.

The functional difference is crucial here, the non-lock characteristics are what allows them to offer atomic updates via HTTP. They couldn’t have used pessimistic locking to do this, not only because it is is limited to a single db transaction, but because they can not trust API users or the network to ensure the rows ever get unlocked.

mirekrusin 1758 days ago

It's not a lock in the sense that that is succeeds or fails immediatelly.

It is a lock in the sense that it allows success only.

Pesimistic lock usually means "maybe wait then success". In complex locking spaghetti it may mean deadlock. It may also mean wait then timeout. It may also mean wait then timeout then I don't know what actually happened, maybe success, maybe not.

nesarkvechnep 1758 days ago

How could it be familiar to people as most of them hopped on the GraphQL hype train and treat HTTP as a dumb pipe.

TuringNYC 1758 days ago

I understand each solution has trade-offs, but I've traditionally done this with an in-db proc (we used stored procedures extensively on Wall Street for txns). How does this compare against having the entire transaction functionality inside the DB?

beachy 1758 days ago

Using stored procedures is old fashioned and won't earn you any points on your resume if you're being interviewed by 20 year olds. It's a solid solution though unless you are at FAANG volumes.

mirekrusin 1758 days ago

Tradeoff is that it has very poor performance for frequently changed data, you can starve clients etc.

It's much better to create transfer api which atomically debits one and credits other account instead of low level individual ops.

ram_rar 1758 days ago

if you're solely relying on "lock_version" numbers, you may wanna watch out for integer overflows (not sure how ruby handles it). I had faced a similar problem in the past, it can have massive ripple effects with downstream services.

Also, if you're using postgres. Its worth looking into advisory locks [1] for similar use cases. They are pretty light weight compared to other locking mechanisms.

[1] https://www.postgresql.org/docs/9.4/explicit-locking.html

ianpurton 1758 days ago

Modern Treasury is a Ruby on Rails shop

I really wouldn't want to write a service that deals with money in an unsafe language. It's so easy to make a mistake in ruby there's no compiler to help you.

quadrature 1758 days ago

It definitely a scary proposition. But it's worth mentioning that two of the world's biggest money moving platforms (Stripe and Shopify) are written in ruby.

valenterry 1758 days ago

Stripe decided to write a type checker for it though, probably for good reasons: https://sorbet.org/

judofyr 1758 days ago

It should be mentioned that their reasons were mainly developer convenience/productivity, and not “correctness”.

I think people severely overrate the value of the language when it comes to avoiding bugs. We already know how to minimize bugs: Extensive testing regimes (automated, manual, or both) and a general focus on correctness over “shipping on an artificial deadline”.

karmakaze 1758 days ago

You can have something that aids convenience/productivity and not correctness. You can also have something that aids correctness that increases developer convenience/productivity. I treat static typing as the latter.

Edit: Btw, the github.com/sorbet/sorbet project is really well run and surprisingly easy to get into. I've been drafting PRs. C++ is not nearly as rough as I remember from decades ago.

nradov 1758 days ago

Ironically, advocates of dynamic typing often claim that it improves developer convenience and productivity.

quadrature 1758 days ago

Yup, also being used at Shopify. Types are awesome, but these companies got very far by betting big on the productivity of ruby before this became a concern.

valenterry 1758 days ago

Not claiming that ruby isn't productive, but I could as well rephrase that into "these companies got very far despite the lack of productivity of ruby before this became a concern". It's not really giving any new insight.

mflamespin 1758 days ago

There's lots of Java development that deals with money movement that is suspect. All languages and frameworks are just tools. It depends on how you use them. Compilers are no guarantee for correctness if how you're modeling types is improper.

ianpurton 1758 days ago

Well to me it's defence in depth.

I've used Ruby in a cryptocurrency project and I've used Rust. I know there are no guarantees but I wouldn't go back to ruby. There are just too many times when the compiler catches something I missed.