Martin Kleppmann has some interesting thoughts on Redlock:
> I think the Redlock algorithm is a poor choice because it is “neither fish nor fowl”: it is unnecessarily heavyweight and expensive for efficiency-optimization locks, but it is not sufficiently safe for situations in which correctness depends on the lock.
Salvatore Sanfilippo (author of Redis and Redlock) wrote a response to Martin Kleppmann's analysis that is worth a read (though it is a bit dense and hard to follow at times): http://antirez.com/news/101
I think I agree with Kleppmann's analysis, though.
> The algorithm's goal was to move away people that were using a single Redis instance, or a master-slave setup with failover, in order to implement distributed locks, to something much more reliable and safe, but having a very low complexity and good performance.
I think this is good perspective: more reliable + safer + good performance. Fine, it's not perfect, but I bet if you are currently using a single-node Redis lock and keep running into problems when it goes down, these improvements sound nice.
Some of antirez's comments surprise me a bit, though.
> A distributed lock without an auto release mechanism, where the lock owner will hold it indefinitely, is basically useless.
I have found durable locks very practical and useful.
Durable locks have a partitioning problem. If the lock holder gets hit by a tornado or catches on fire then there is no recovery method short of manual intervention.
I took a formal class on distributed systems back when dinosaurs roamed the earth and the implementation of Ethernet was still considered interesting. And even back then we talked about leases for locks.
We have things we call "durable locks" (though it sounds like that's a loaded term whose precise meaning I don't know) that work by recording lock holders in persistent storage, plus a corresponding volatile lock that holders use when they need to assert ownership (e.g. to perform a write).
In our system, the only programs allowed to take "durable locks" are ones that are guaranteed to complete (i.e., their existence is also recorded in persistent storage, and they are retried until completion). The "durable" part means that even if they restart or die, other writers can't jump in and screw things up. The "volatile" part guarantees that only one of them will be writing at the same time.
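A minimal sketch of that durable-plus-volatile scheme, with hypothetical names and a plain dict standing in for persistent storage (the real system presumably uses a database and a proper volatile lock service):

```python
import threading

class DurableLock:
    """Sketch: ownership is recorded durably so it survives holder restarts;
    a separate volatile lock is taken only while the owner asserts ownership
    (e.g. during a write). Names and structure here are illustrative."""
    def __init__(self, persistent_store):
        self.store = persistent_store      # stands in for durable storage
        self.volatile = threading.Lock()   # in-process stand-in for the volatile lock

    def acquire_durable(self, key, owner):
        # Record ownership durably; a restarted owner still holds the lock.
        if self.store.get(key) not in (None, owner):
            return False                   # another writer can't jump in
        self.store[key] = owner
        return True

    def assert_ownership(self, key, owner):
        # The volatile lock ensures only one holder writes at a time.
        with self.volatile:
            return self.store.get(key) == owner

store = {}
lock = DurableLock(store)
assert lock.acquire_durable("resource", "job-1")
assert not lock.acquire_durable("resource", "job-2")  # blocked by durable record
assert lock.assert_ownership("resource", "job-1")
```

Because the durable record is only ever released by the (guaranteed-to-complete) owner, there is no TTL and no auto-release, which is exactly the property antirez's comment argues against.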
I wonder what Martin would have to say about our weird little locks
The thing about most modern web backends is that virtually nothing is guaranteed to complete.
The processes that use locks are often short-lived. They live in short-lived containers with no state, or maybe they're just lambdas executing under a strict resource limit. Either way, there's nobody to clean up after them or restart them once they're killed. When they begin a database transaction and then disappear for any reason, the best practice is to roll back and pretend they never did anything.
In this brave new world of YOLO lock holders, antirez's position makes a lot of sense. There's definitely still room for old-fashioned durable locks, but these are different use cases.
Autoscaling might mean there isn’t even a machine that corresponds to that dead server for days, weeks, or months.
What GP said sounds like it has leases of a fashion. Maybe not the jargon I’d choose to describe them but the industry is full of misleading names for things.
Times have changed, though. There are better implementations of these, and many companies have built successful startups around these ideas. Look around: it's the age of distributed locking!
I think this is a good read, but not so much a rebuttal. He never really addresses the following scenario:
1. Get the current time.
2. … All the steps needed to acquire the lock …
3. Get the current time, again.
4. Check if we are already out of time, or if we acquired the lock fast enough.
4.5. client pauses for whatever reason (the example Kleppmann gives is a GC pause), long enough for the lock to expire
4.75. another client acquires the lock
5. Two clients simultaneously hold the lock
Which is the core of Kleppmann's argument against Redlock's correctness. I think the strongest conclusion Sanfilippo can arrive at is that the algorithm is safer than single-node locking.
I've personally used Redis locks in my personal projects. The real motivation isn't that it's good or correct, but that Redis is already there in my environment.
>The Hacker News user eurleif noticed how it is possible to reacquire the lock as a strategy if the client notices it is taking too much time in order to complete the operation. This can be done by just extending an existing lock, sending a script that extends the expire of the value stored at the key is the expected one. If there are no new partitions, and we try to extend the lock enough in advance so that the keys will not expire, there is the guarantee that the lock will be extended.
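The compare-and-extend step the quote describes can be sketched in pure Python (a dict stands in for Redis; in real Redis the check and the expiry update would have to run atomically, e.g. as a Lua script):

```python
def extend_lock(store, expiries, key, token, extra_ttl, now):
    """Sketch of compare-and-extend: push the expiry out only if the value
    stored at `key` is still our own token. All names are illustrative."""
    if store.get(key) != token:
        return False                      # lock lost or held by someone else
    expiries[key] = now + extra_ttl
    return True

store, expiries = {"job": "token-A"}, {"job": 100.0}
assert extend_lock(store, expiries, "job", "token-A", 30.0, now=90.0)
assert expiries["job"] == 120.0           # extended before the old expiry hit
assert not extend_lock(store, expiries, "job", "token-B", 30.0, now=90.0)
```

The guarantee in the quote holds only under its own preconditions: no new partitions, and the extension attempted far enough before expiry.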
Something I don't enjoy about remote/distributed locks is that unlike distributed transactions they're usually unable to provide any strict guarantees about things they protect.
E.g. if your algorithm is:
1) Acquire the distributed lock
2) Do the thing
3) Release the lock
And the node goes dark for a while between steps 1 and 2 (e.g. 100% CPU load), by the time it reaches 2 the lock may have already expired and another node is holding it, resulting in a race. Adding steps like "1.1 double/triple check the lock is still held" obviously doesn't help because the node can go dark right after these and resume operation at 2. The probability of these is not too high, but still: no guarantees. Furthermore at a certain scale you do actually start seeing rogue nodes deemed dead hours ago suddenly coming back to life and doing unpleasant things.
The rule of thumb usually is "keep locks within the same transaction space as the thing they protect", and often you don't even need locks in that case; transactions alone can be enough. If you're trying to protect something that is inherently un-transactional then, well, good luck, because these efforts are always probabilistic in nature.
A good use case for a remote lock is when it's not actually used to guarantee consistency or avoid races, but merely to prevent duplicate computation for cost/performance reasons. For all other cases I outright recommend avoiding them.
A lot of what you say is explained in detail in Martin Kleppmann's article[0]. As you said, there's no guarantee about when the lock will expire. The proper solution for this is a fencing token. The idea is similar to how people have used optimistic locking when updating data in a DB to avoid two users overwriting each other's work.
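A minimal sketch of the fencing-token idea (illustrative names; the key point is that the *protected storage*, not the lock service, enforces the check):

```python
class FencedStore:
    """The storage rejects any write carrying a token older than the newest
    token it has already seen, so a paused ex-holder can't clobber data."""
    def __init__(self):
        self.latest_token = -1
        self.value = None

    def write(self, token, value):
        if token <= self.latest_token:
            return False                  # stale holder: write rejected
        self.latest_token, self.value = token, value
        return True

store = FencedStore()
assert store.write(33, "from old holder")   # client holding token 33 writes
assert store.write(34, "from new holder")   # lock expired; new holder got 34
assert not store.write(33, "late write")    # old client resumes after a pause
assert store.value == "from new holder"
```

This only works if the lock service hands out monotonically increasing tokens and the storage checks them, which plain Redlock does not bake in.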
Yes, exactly! We found out the hard way just how unreliable Redis-based locks are, and switched to Postgres locks. It works reliably since our code is already in a Postgres transaction.
Created a “lock” table with a single string key column, so you can “select key for update” on an arbitrary string key (similar UX to redis lock). I looked at advisory locks, but they don’t work when the lock key needs to be dynamically generated.
After reading the current[1] top comment about Redlock, this was literally the next low-effort thing that came to mind, so I'm glad to find someone else's experience with using a PostgreSQL table as a lock.
I will need a distributed lock soon, but I've never used one before so I'm taking this chance to learn about them.
If it goes dark a microsecond after #3 you might have an ambiguous success. Transaction processed but you didn't get a confirmation.
A lot of robust systems end up implementing their own bespoke WAL semantics on top of the system of record. It's like we should have a formal solution for doing that by now.
True. Even simple scenarios like "save a file in S3 iff the S3 link is saved in Postgres", which appear in virtually every application, are rarely handled well.
Uggh, I was cornered into writing a couple of these over the years. The way I handled it was:
1. make sure both operations will be retried if they don't run to completion, and
2. think through how the rest of the system would react to one of them being present without the other
Then I use whichever of the two orderings is less bad from the perspective of #2. Obviously this depends on the exact use case -- I was simply lucky that the rest of the system was designed in such a way that it could tolerate that bad intermediate state.
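The two rules above can be sketched like this, with hypothetical in-memory stores standing in for S3 and the database, and with the ordering chosen so the tolerable intermediate state (an orphaned blob with no DB row) is the one that can occur:

```python
def save_with_retry(blob_store, db, key, blob, attempts=3):
    """Sketch: write the blob first, then the DB link, retrying each step.
    An orphaned blob is assumed tolerable; a dangling DB link is not."""
    for _ in range(attempts):
        try:
            blob_store[key] = blob         # step 1: the 'less bad' orphan side
            break
        except IOError:
            continue
    if key not in blob_store:
        return False                       # never create a dangling link
    for _ in range(attempts):
        try:
            db[key] = f"link-to-{key}"     # step 2: record the link
            return True
        except IOError:
            continue
    return False                           # blob orphaned; system tolerates it

blob_store, db = {}, {}
assert save_with_retry(blob_store, db, "img1", b"...")
assert "img1" in blob_store and db["img1"] == "link-to-img1"
```

As the comment says, which ordering is "less bad" depends entirely on how the rest of the system reacts to each half-finished state.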
To counter some of the hate in this thread: I have used this to great success as an "opportunistic" locking mechanism to, for example, reduce load on a Postgres database. The winner of the race to acquire the lock would run the (expensive) query then cache the result. On lock release, the nodes waiting on the lock would then try to read the cache before trying to acquire the lock again.
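That opportunistic pattern, sketched with a `threading.Lock` standing in for the distributed lock and a dict for the cache (names illustrative):

```python
import threading

cache = {}
cache_lock = threading.Lock()             # stands in for the distributed lock

def get_expensive(key, compute):
    """Losers of the lock race read the winner's cached result instead of
    re-running the expensive query."""
    if key in cache:
        return cache[key]
    with cache_lock:
        if key not in cache:              # only the race winner computes
            cache[key] = compute()
        return cache[key]

calls = []
def expensive_query():
    calls.append(1)                       # count how often the query runs
    return 42

threads = [threading.Thread(target=get_expensive, args=("q", expensive_query))
           for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()
assert get_expensive("q", expensive_query) == 42
assert len(calls) == 1                    # the expensive query ran exactly once
```

Since the lock only saves duplicate work, a rare double execution (e.g. if the distributed lock misbehaves) costs money, not correctness, which is what makes this use case a good fit.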
Redis is a very bad store for a distributed lock but Postgres is only slightly better.
What you truly need is something like ZooKeeper or etcd, which are designed to achieve distributed consensus using algorithms like Paxos or Raft.
This ensures strong consistency and reliability in a distributed system, making them ideal for tasks like leader election, configuration management, and lease management where consistency across nodes is critical.
Paxos and Raft are consensus algorithms that provide certain guarantees and capabilities that a master-slave system with synchronous replication, such as PostgreSQL, cannot offer.
These algorithms ensure that a majority of nodes (a quorum) must agree on any proposed change. This agreement guarantees that once a decision is made (e.g., to commit a transaction), it is final and consistent across all nodes. This strong consistency is critical in distributed systems to avoid split-brain scenarios.
Split brain is easily caused by:
1. Network partitions
2. Latency issues
3. Async failover (two nodes both think they are the master)
4. Replica lag (some but not all replicas acknowledged the write while the master sent confirmation to the client)
Redis Sentinel provides high availability and monitoring for Redis, but it does not guarantee strong consistency.
Linearizability requires that once a write is acknowledged, all subsequent reads should reflect that write.
If min-replicas-to-write is set to the total number of Redis replicas, then if a single node goes down you won't be able to do any writes (take the lock or release the lock).
If min-replicas-to-write is set to any number smaller than the total number of Redis replicas, some replicas could still be lagging because of asynchronous replication.
Also, when a replica acknowledges a write in Redis, it means the write has been received and logged by the replica, but not necessarily that it has been fully processed and applied to the dataset.
This means reading from a replica that acknowledged a write from the master might still return the old value for the key.
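For reference, min-replicas-to-write is paired with a lag bound in the Redis config; a fragment like the following (illustrative values) bounds how stale acknowledged replicas may be, but still does not make replication synchronous or linearizable:

```conf
# Refuse writes on the master unless at least 1 replica is connected
# and its last acknowledged replication offset is at most 10s old.
min-replicas-to-write 1
min-replicas-max-lag 10
```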
It's an out-of-band locking mechanism. In particular, one which mentions "you should consider fencing tokens" merely in passing instead of baking them in from the start.
The only reasonable correctness-oriented view of that is "lol". It's not worth throwing a Jepsen-like test at it, the fundamentals aren't even slightly sound, merely "usually good enough". Whether that's worth it for [use X] depends on that use - often yes!
I have seen (ad hoc) implementations go quite bad many times. I encounter them quite a bit as a distributed replacement for some kind of DB transaction, where the DB is something like RDS: someone thought they were being smart and building "things at scale" (they read about it on Reddit, etc.) when a DB transaction would have been the correct solution and they didn't need scale anyway (or underestimated the capabilities of modern MySQL/Postgres).
> I think the Redlock algorithm is a poor choice because it is “neither fish nor fowl”: it is unnecessarily heavyweight and expensive for efficiency-optimization locks, but it is not sufficiently safe for situations in which correctness depends on the lock.
https://martin.kleppmann.com/2016/02/08/how-to-do-distribute...