Salvatore Sanfilippo (author of Redis and Redlock) wrote a response to Martin Kl...

dang · on Aug 22, 2024

Discussed at the time:

Is Redlock Safe? Reply to Redlock Analysis - https://news.ycombinator.com/item?id=11065933 - Feb 2016 (135 comments)

zinodaur · on Aug 22, 2024

> The algorithm's goal was to move away people that were using a single Redis instance, or a master-slave setup with failover, in order to implement distributed locks, to something much more reliable and safe, but having a very low complexity and good performance.

I think this is a good perspective. More reliable + more safe + good performance - fine, it's not perfect, but I bet if you are currently using a single-node Redis lock and keep running into problems when it goes down, these improvements sound nice.

Some of antirez's comments surprise me a bit though

> A distributed lock without an auto release mechanism, where the lock owner will hold it indefinitely, is basically useless.

I have found durable locks very practical and useful

hinkley · on Aug 22, 2024

Durable locks have a partitioning problem. If the lock holder gets hit by a tornado or catches on fire then there is no recovery method short of manual intervention.

I took a formal class on distributed systems back when dinosaurs roamed the earth and the implementation of Ethernet was still considered interesting. And even back then we talked about leases for locks.
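To make the lease idea concrete, here is a minimal in-memory sketch. `LeaseLock`, its methods, and the injectable `clock` are hypothetical names chosen for illustration; this is not Redis or Redlock code, only the general shape of a lock that frees itself once its time-to-live runs out.

```python
import time
from typing import Callable, Optional


class LeaseLock:
    """In-memory lease: ownership lapses on its own after ttl seconds."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._owner: Optional[str] = None
        self._expires_at = 0.0

    def acquire(self, owner: str, ttl: float) -> bool:
        now = self._clock()
        # A lease past its expiry counts as free, even if it was never released.
        if self._owner is not None and now < self._expires_at:
            return False
        self._owner = owner
        self._expires_at = now + ttl
        return True

    def holder(self) -> Optional[str]:
        # Report an owner only while the lease is still live.
        if self._owner is not None and self._clock() < self._expires_at:
            return self._owner
        return None
```

If holder "a" takes a 10-second lease and then dies, a later `acquire("b", ttl=10.0)` succeeds once the clock passes the expiry, with no manual intervention. The price is that "a" might only be paused rather than dead, which is the problem discussed further down the thread.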

zinodaur · on Aug 22, 2024

We have things we call "durable locks" (but it sounds like that's a loaded term that I don't know the meaning of) that work by recording lock holders in persistent storage + using a corresponding volatile lock when the lock holders need to assert ownership (e.g. to perform a write).

In our system, the only programs that are allowed to take "durable locks" are ones that are guaranteed to complete (i.e., their existence is also recorded in persistent storage, and they are retried until completion). The "durable" part means that even if they restart or die, other writers can't jump in and screw things up. The "volatile" part guarantees that only one of them will be writing at the same time.
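One possible reading of that design, as a rough sketch only: a plain dict stands in for the persistent store, and a `threading.Lock` per resource stands in for the volatile lock. `DurableLocks`, `claim`, `writing`, and `complete` are hypothetical names; the actual system described above may work quite differently.

```python
import threading
from contextlib import contextmanager
from typing import Dict, Iterator


class DurableLocks:
    def __init__(self) -> None:
        # Stand-in for persistent storage: resource -> owning job id.
        self.records: Dict[str, str] = {}
        # Volatile locks live in memory only and are lost on restart by design.
        self._volatile: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()

    def claim(self, resource: str, job_id: str) -> bool:
        """Record ownership durably; a retried job re-claims its own record."""
        with self._guard:
            return self.records.setdefault(resource, job_id) == job_id

    @contextmanager
    def writing(self, resource: str, job_id: str) -> Iterator[None]:
        """Assert ownership for one write; at most one writer at a time."""
        with self._guard:
            if self.records.get(resource) != job_id:
                raise PermissionError(f"{job_id} does not own {resource}")
            vlock = self._volatile.setdefault(resource, threading.Lock())
        if not vlock.acquire(blocking=False):
            raise RuntimeError("another instance of this job is writing")
        try:
            yield
        finally:
            vlock.release()

    def complete(self, resource: str, job_id: str) -> None:
        """Only the owning job removes the durable record, once it is done."""
        with self._guard:
            if self.records.get(resource) == job_id:
                del self.records[resource]
```

In this sketch a restarted "job-1" can call `claim` again and still own the resource, while "job-2" is refused until `complete` runs. Nothing here expires, so it only works if jobs really are retried until completion.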

I wonder what Martin would have to say about our weird little locks

kijin · on Aug 22, 2024

The thing about most modern web backends is that virtually nothing is guaranteed to complete.

The processes that use locks are often short-lived. They live in short-lived containers with no state, or maybe they're just lambdas executing under a strict resource limit. Either way, there's nobody to clean up after them or restart them once they're killed. When they begin a database transaction and then disappear for any reason, the best practice is to roll back and pretend they never did anything.

In this brave new world of YOLO lock holders, antirez's position makes a lot of sense. There's definitely still room for old-fashioned durable locks, but these are different use cases.

hinkley · on Aug 22, 2024

Autoscaling might mean there isn’t even a machine that corresponds to that dead server for days, weeks, or months.

What GP said sounds like it has leases of a fashion. Maybe not the jargon I’d choose to describe them but the industry is full of misleading names for things.

pookybear223 · on Aug 22, 2024

Times have changed, though. There are better implementations of these, and many companies have built successful startups based around these ideas. Look around, it's the age of distributed locking!

wesselbindt · on Aug 22, 2024

I think this is a good read, but not so much a rebuttal. He never really addresses the following scenario:

1. Get the current time.

2. … All the steps needed to acquire the lock …

3. Get the current time, again.

4. Check if we are already out of time, or if we acquired the lock fast enough.

4.5. Client pauses for whatever reason (the example Kleppmann gives is a GC pause), long enough for the lock to expire.

4.75. Another client acquires the lock.

5. Two clients simultaneously hold the lock

This is the core of Kleppmann's argument against Redlock's correctness. I think the conclusion Sanfilippo can arrive at is that the algorithm is safer than the single-node locking algorithm.
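The timeline above can be replayed deterministically with a fake clock. This is a toy sketch, not Redlock: a single `ExpiringLock` stands in for the majority of Redis instances, and `FakeClock`, `ExpiringLock`, `simulate`, and the 10-second TTL are all made up for illustration.

```python
from typing import Optional, Tuple


class FakeClock:
    def __init__(self) -> None:
        self.now = 0.0

    def advance(self, seconds: float) -> None:
        self.now += seconds


class ExpiringLock:
    """Lock that frees itself once its TTL has passed."""

    def __init__(self, clock: FakeClock) -> None:
        self._clock = clock
        self._owner: Optional[str] = None
        self._expires_at = 0.0

    def try_acquire(self, owner: str, ttl: float) -> bool:
        if self._owner is not None and self._clock.now < self._expires_at:
            return False
        self._owner = owner
        self._expires_at = self._clock.now + ttl
        return True


def simulate() -> Tuple[bool, bool]:
    clock = FakeClock()
    lock = ExpiringLock(clock)
    ttl = 10.0

    start = clock.now                                   # 1. get the current time
    acquired = lock.try_acquire("client-1", ttl)        # 2. acquire the lock
    clock.advance(1.0)                                  #    acquisition took 1 s
    elapsed = clock.now - start                         # 3. get the time again
    client1_believes = acquired and elapsed < ttl       # 4. validity check passes

    clock.advance(15.0)                                 # 4.5. pause longer than the TTL
    client2_believes = lock.try_acquire("client-2", ttl)  # 4.75. second client gets in

    # 5. client-1 resumes with its stale "I hold the lock" result.
    return client1_believes, client2_believes
```

`simulate()` returns `(True, True)`: the check in step 4 was correct when it ran, but nothing re-validates it after the pause, so both clients proceed as lock holders.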