Martin Kleppmann has some interesting thoughts on Redlock:
> I think the Redlock algorithm is a poor choice because it is “neither fish nor fowl”: it is unnecessarily heavyweight and expensive for efficiency-optimization locks, but it is not sufficiently safe for situations in which correctness depends on the lock.
Salvatore Sanfilippo (author of Redis and Redlock) wrote a response to Martin Kleppmann's analysis that is worth a read (though it is a bit dense and hard to follow at times): http://antirez.com/news/101
I think I agree with Kleppmann's analysis, though.
> The algorithm's goal was to move away people that were using a single Redis instance, or a master-slave setup with failover, in order to implement distributed locks, to something much more reliable and safe, but having a very low complexity and good performance.
I think this is good perspective: more reliable + safer + good performance. Fine, it's not perfect, but I bet if you are currently using a single-node Redis lock and keep running into problems when it goes down, these improvements sound nice.
Some of antirez's comments surprise me a bit, though.
> A distributed lock without an auto release mechanism, where the lock owner will hold it indefinitely, is basically useless.
I have found durable locks very practical and useful.
Durable locks have a partitioning problem. If the lock holder gets hit by a tornado or catches on fire then there is no recovery method short of manual intervention.
I took a formal class on distributed systems back when dinosaurs roamed the earth and the implementation of Ethernet was still considered interesting. And even back then we talked about leases for locks.
We have things we call "durable locks" (though it sounds like that's a loaded term whose precise meaning I don't know) that work by recording lock holders in persistent storage, plus a corresponding volatile lock that holders use when they need to assert ownership (e.g. to perform a write).
In our system, the only programs allowed to take "durable locks" are ones that are guaranteed to complete (i.e., their existence is also recorded in persistent storage, and they are retried until completion). The "durable" part means that even if they restart or die, other writers can't jump in and screw things up. The "volatile" part guarantees that only one of them will be writing at the same time.
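A minimal sketch of that durable-plus-volatile scheme, with hypothetical names and a plain dict standing in for persistent storage (the real system presumably uses a database and a proper volatile lock service):

```python
import threading

class DurableLock:
    """Sketch: ownership is recorded durably so it survives holder restarts;
    a separate volatile lock is taken only while the owner asserts ownership
    (e.g. during a write). Names and structure here are illustrative."""
    def __init__(self, persistent_store):
        self.store = persistent_store      # stands in for durable storage
        self.volatile = threading.Lock()   # in-process stand-in for the volatile lock

    def acquire_durable(self, key, owner):
        # Record ownership durably; a restarted owner still holds the lock.
        if self.store.get(key) not in (None, owner):
            return False                   # another writer can't jump in
        self.store[key] = owner
        return True

    def assert_ownership(self, key, owner):
        # The volatile lock ensures only one holder writes at a time.
        with self.volatile:
            return self.store.get(key) == owner

store = {}
lock = DurableLock(store)
assert lock.acquire_durable("resource", "job-1")
assert not lock.acquire_durable("resource", "job-2")  # blocked by durable record
assert lock.assert_ownership("resource", "job-1")
```

Because the durable record is only ever released by the (guaranteed-to-complete) owner, there is no TTL and no auto-release, which is exactly the property antirez's comment argues against.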
I wonder what Martin would have to say about our weird little locks
The thing about most modern web backends is that virtually nothing is guaranteed to complete.
The processes that use locks are often short-lived. They live in short-lived containers with no state, or maybe they're just lambdas executing under a strict resource limit. Either way, there's nobody to clean up after them or restart them once they're killed. When they begin a database transaction and then disappear for any reason, the best practice is to roll back and pretend they never did anything.
In this brave new world of YOLO lock holders, antirez's position makes a lot of sense. There's definitely still room for old-fashioned durable locks, but these are different use cases.
Autoscaling might mean there isn’t even a machine that corresponds to that dead server for days, weeks, or months.
What GP said sounds like it has leases of a fashion. Maybe not the jargon I’d choose to describe them but the industry is full of misleading names for things.
Times have changed, though. There are better implementations of these, and many companies have built successful startups around these ideas. Look around: it's the age of distributed locking!
I think this is a good read, but not so much a rebuttal. He never really addresses the following scenario:
1. Get the current time.
2. … All the steps needed to acquire the lock …
3. Get the current time, again.
4. Check if we are already out of time, or if we acquired the lock fast enough.
4.5. client pauses for whatever reason (the example Kleppmann gives is a GC pause), long enough for the lock to expire
4.75. another client acquires the lock
5. Two clients simultaneously hold the lock
Which is the core of Kleppmann's argument against Redlock's correctness. I think the strongest conclusion Sanfilippo can arrive at is that the algorithm is safer than single-node locking.
I've personally used Redis locks in my personal projects. The real motivation isn't that it's good or correct, but that Redis is already there in my environment.
>The Hacker News user eurleif noticed how it is possible to reacquire the lock as a strategy if the client notices it is taking too much time in order to complete the operation. This can be done by just extending an existing lock, sending a script that extends the expire of the value stored at the key is the expected one. If there are no new partitions, and we try to extend the lock enough in advance so that the keys will not expire, there is the guarantee that the lock will be extended.
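The compare-and-extend step the quote describes can be sketched in pure Python (a dict stands in for Redis; in real Redis the check and the expiry update would have to run atomically, e.g. as a Lua script):

```python
def extend_lock(store, expiries, key, token, extra_ttl, now):
    """Sketch of compare-and-extend: push the expiry out only if the value
    stored at `key` is still our own token. All names are illustrative."""
    if store.get(key) != token:
        return False                      # lock lost or held by someone else
    expiries[key] = now + extra_ttl
    return True

store, expiries = {"job": "token-A"}, {"job": 100.0}
assert extend_lock(store, expiries, "job", "token-A", 30.0, now=90.0)
assert expiries["job"] == 120.0           # extended before the old expiry hit
assert not extend_lock(store, expiries, "job", "token-B", 30.0, now=90.0)
```

The guarantee in the quote holds only under its own preconditions: no new partitions, and the extension attempted far enough before expiry.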
Something I don't enjoy about remote/distributed locks is that unlike distributed transactions they're usually unable to provide any strict guarantees about things they protect.
E.g. if your algorithm is:
1) Acquire the distributed lock
2) Do the thing
3) Release the lock
And the node goes dark for a while between steps 1 and 2 (e.g. 100% CPU load), by the time it reaches 2 the lock may have already expired and another node is holding it, resulting in a race. Adding steps like "1.1 double/triple check the lock is still held" obviously doesn't help because the node can go dark right after these and resume operation at 2. The probability of these is not too high, but still: no guarantees. Furthermore at a certain scale you do actually start seeing rogue nodes deemed dead hours ago suddenly coming back to life and doing unpleasant things.
The rule of thumb usually is "keep locks within the same transaction space as the thing they protect", and often you don't even need locks in that case; transactions alone can be enough. If you're trying to protect something that is inherently un-transactional then, well, good luck, because these efforts are always probabilistic in nature.
A good use case for a remote lock is when it's not actually used to guarantee consistency or avoid races, but merely to prevent duplicate computation for cost/performance reasons. For all other cases I outright recommend avoiding them.
A lot of what you say is explained in detail in Martin Kleppmann's article[0]. As you said, there's no guarantee about when the lock will expire. The proper solution for this is a fencing token. The idea is similar to how people have used optimistic locking when updating data in a DB to avoid two users overwriting each other's work.
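A minimal sketch of the fencing-token idea (illustrative names; the key point is that the *protected storage*, not the lock service, enforces the check):

```python
class FencedStore:
    """The storage rejects any write carrying a token older than the newest
    token it has already seen, so a paused ex-holder can't clobber data."""
    def __init__(self):
        self.latest_token = -1
        self.value = None

    def write(self, token, value):
        if token <= self.latest_token:
            return False                  # stale holder: write rejected
        self.latest_token, self.value = token, value
        return True

store = FencedStore()
assert store.write(33, "from old holder")   # client holding token 33 writes
assert store.write(34, "from new holder")   # lock expired; new holder got 34
assert not store.write(33, "late write")    # old client resumes after a pause
assert store.value == "from new holder"
```

This only works if the lock service hands out monotonically increasing tokens and the storage checks them, which plain Redlock does not bake in.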
Yes, exactly! We found out the hard way just how unreliable Redis-based locks are, and switched to Postgres locks. It works reliably since our code is already in a Postgres transaction.
Created a “lock” table with a single string key column, so you can “select key for update” on an arbitrary string key (similar UX to redis lock). I looked at advisory locks, but they don’t work when the lock key needs to be dynamically generated.
After reading the current[1] top comment about Redlock, this was literally the next low-effort thing that came to mind, so I'm glad to find someone else's experience with using a PostgreSQL table as a lock.
I will need a distributed lock soon, but I've never used one before so I'm taking this chance to learn about them.
If it goes dark a microsecond after #3 you might have an ambiguous success. Transaction processed but you didn't get a confirmation.
A lot of robust systems end up implementing their own bespoke WAL semantics on top of the system of record. It's like we should have a formal solution for doing that by now.
True. Even simple scenarios like "save a file in S3 iff the S3 link is saved in Postgres", which appear in virtually every application, are rarely handled well.
Uggh, I was cornered into writing a couple of these over the years. The way I handled it was:
1. make sure both operations will be retried if they don't run to completion, and
2. think through how the rest of the system would react to one of them being present without the other
Then I use whichever of the two orderings is less bad from the perspective of #2. Obviously this depends on the exact use case -- I was simply lucky that the rest of the system was designed in such a way that it could tolerate that bad intermediate state.
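The two rules above can be sketched like this, with hypothetical in-memory stores standing in for S3 and the database, and with the ordering chosen so the tolerable intermediate state (an orphaned blob with no DB row) is the one that can occur:

```python
def save_with_retry(blob_store, db, key, blob, attempts=3):
    """Sketch: write the blob first, then the DB link, retrying each step.
    An orphaned blob is assumed tolerable; a dangling DB link is not."""
    for _ in range(attempts):
        try:
            blob_store[key] = blob         # step 1: the 'less bad' orphan side
            break
        except IOError:
            continue
    if key not in blob_store:
        return False                       # never create a dangling link
    for _ in range(attempts):
        try:
            db[key] = f"link-to-{key}"     # step 2: record the link
            return True
        except IOError:
            continue
    return False                           # blob orphaned; system tolerates it

blob_store, db = {}, {}
assert save_with_retry(blob_store, db, "img1", b"...")
assert "img1" in blob_store and db["img1"] == "link-to-img1"
```

As the comment says, which ordering is "less bad" depends entirely on how the rest of the system reacts to each half-finished state.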
To counter some of the hate in this thread: I have used this to great success as an "opportunistic" locking mechanism to, for example, reduce load on a Postgres database. The winner of the race to acquire the lock would run the (expensive) query then cache the result. On lock release, the nodes waiting on the lock would then try to read the cache before trying to acquire the lock again.
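That opportunistic pattern, sketched with a `threading.Lock` standing in for the distributed lock and a dict for the cache (names illustrative):

```python
import threading

cache = {}
cache_lock = threading.Lock()             # stands in for the distributed lock

def get_expensive(key, compute):
    """Losers of the lock race read the winner's cached result instead of
    re-running the expensive query."""
    if key in cache:
        return cache[key]
    with cache_lock:
        if key not in cache:              # only the race winner computes
            cache[key] = compute()
        return cache[key]

calls = []
def expensive_query():
    calls.append(1)                       # count how often the query runs
    return 42

threads = [threading.Thread(target=get_expensive, args=("q", expensive_query))
           for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()
assert get_expensive("q", expensive_query) == 42
assert len(calls) == 1                    # the expensive query ran exactly once
```

Since the lock only saves duplicate work, a rare double execution (e.g. if the distributed lock misbehaves) costs money, not correctness, which is what makes this use case a good fit.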
Redis is a very bad store for a distributed lock but Postgres is only slightly better.
What you truly need is something like ZooKeeper or etcd, which are designed to achieve distributed consensus using algorithms like Paxos or Raft.
This ensures strong consistency and reliability in a distributed system, making them ideal for tasks like leader election, configuration management, and lease management where consistency across nodes is critical.
Paxos and Raft are consensus algorithms that provide certain guarantees and capabilities that a master-slave system with synchronous replication, such as PostgreSQL, cannot offer.
These algorithms ensure that a majority of nodes (a quorum) must agree on any proposed change. This agreement guarantees that once a decision is made (e.g., to commit a transaction), it is final and consistent across all nodes. This strong consistency is critical in distributed systems to avoid split-brain scenarios.
Split brain is easily caused by:
1. Network partitions
2. Latency issues
3. Async failover (two nodes both think they are the master)
4. Replica lag (some but not all replicas acknowledged the write while the master sent confirmation to the client)
Redis Sentinel provides high availability and monitoring for Redis, but it does not guarantee strong consistency.
Linearizability requires that once a write is acknowledged, all subsequent reads should reflect that write.
If min-replicas-to-write is set to the total number of Redis replicas, then if a single node goes down you won't be able to do any writes (take the lock or release the lock).
If min-replicas-to-write is set to any number smaller than the total number of Redis replicas, some replicas could still be lagging because of asynchronous replication.
Also, when a replica acknowledges a write in Redis, it means the write has been received and logged by the replica, but not necessarily that it has been fully processed and applied to the dataset.
This means reading from a replica that acknowledged a write from the master might still return the old value for the key.
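For reference, min-replicas-to-write is paired with a lag bound in the Redis config; a fragment like the following (illustrative values) bounds how stale acknowledged replicas may be, but still does not make replication synchronous or linearizable:

```conf
# Refuse writes on the master unless at least 1 replica is connected
# and its last acknowledged replication offset is at most 10s old.
min-replicas-to-write 1
min-replicas-max-lag 10
```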
It's an out-of-band locking mechanism. In particular, one which mentions "you should consider fencing tokens" merely in passing instead of baking them in from the start.
The only reasonable correctness-oriented view of that is "lol". It's not worth throwing a Jepsen-like test at it, the fundamentals aren't even slightly sound, merely "usually good enough". Whether that's worth it for [use X] depends on that use - often yes!
I have seen (ad hoc) implementations go quite bad many times. I encounter them quite a bit as a distributed replacement for some kind of DB transaction, where the DB is something like RDS: someone thought they were being smart and building "things at scale" (they read about it on Reddit, etc.) when a DB transaction would have been the correct solution and they didn't need scale anyway (or underestimated the capabilities of modern MySQL/Postgres).
> I think the Redlock algorithm is a poor choice because it is “neither fish nor fowl”: it is unnecessarily heavyweight and expensive for efficiency-optimization locks, but it is not sufficiently safe for situations in which correctness depends on the lock.
https://martin.kleppmann.com/2016/02/08/how-to-do-distribute...