What are Database Locks?
A peek into the world of database locks, their types and issues associated
Hello 👋🏼,
Denny here.
The past couple months were quite busy for me and I apologize for not being able to write to you. Now that I have shifted to a new city and have settled at my job, I am finding more time to resume these newsletter.
Today we will talk about database locks and why you need them. I am aiming at publishing this as a two-part issue since it’s a fairly big topic. This week we will cover the basics of locking, its types, and issues associated with locks.
Let’s get into the topic now ✨
Concurrency control is the term you should be aware of before getting into locks.
In a large application accessed by thousands of users, concurrency is inevitable. Your application need to be able to handle multiple requests simultaneously.
When you execute operations concurrently, the results can be conflicting. For e.g. if you are reading a row while someone else is writing to it simultaneously, then you are bound to get inconsistent data.
If we execute these transactions sequentially then we don’t need concurrency control. But sequential execution affects the scaling of the systems, hence we cannot avoid concurrent transactions.
When the operations are executed concurrently, how do we make sure the results have consistency?
There are various techniques for concurrency control. And one of them is using locks 🔐
So what are locks exactly? All you have to know is that locks are one mechanism to ensure data integrity. How these locks are implemented is beyond the scope of this issue.
So how do they work? Locks work by letting only one transaction modify or access a certain row or table. This makes it look like the transactions are executed one after the other and hence maintaining data integrity.
Different types of locks
There are majorly 2 types of locks. Exclusive locks and shared locks.
Shared locks let you read the row or the table that is being locked. Hence it’s also called a read lock. Multiple transactions (you can think of each transaction as a separate process) can acquire a shared lock on the same resource and read from it. No transaction is allowed to update the resource while it has a shared lock.
Exclusive locks lock the row or table entirely and let the transaction update the row in isolation. Unlike shared locks, only one transaction can acquire an exclusive lock on a certain resource at one point in time. And while it acquires the lock on that resource, other processes that want to acquire the lock on the same resource will have to wait. Once the lock is released, the remaining processes can acquire it and make modifications.
You should note that multiple shared locks can be acquired on a resource at one time. But if the resource already has a shared lock then another process cannot acquire an exclusive lock on it.
Similarly, a process cannot acquire a shared lock on a resource that is locked by an exclusive lock.
There are other types of locks that are modifications to the above. For e.g. write locks which lets you read a resource while it’s been updated.
Optimistic vs pessimistic locking
This one is an interesting concept. See how I mentioned above that all the parallel processes that want to do database transactions will have to sequentially execute them in order to maintain data integrity?
But that can slow things down quite a lot, especially when the transactions involve multiple steps or updates.
So there are two ways of going ahead with it.
One approach is to let the processes update the same records in parallel. Only when one process successfully commits its changes, the other process is told that there exists a conflict, due to which the other process will have to attempt the transaction again.
Another approach is to lock the row as soon as one process attempts to modify it (or delete it) and ask the other processes to wait before doing anything.
The first approach is lovingly named optimistic locking. It’s letting everyone have a chance at updating the records.
The second approach is called pessimistic locking wherein the first one to attempt gets the chance to update the record.
The difference these two methods induce can be felt in large transactions. Remember that a transaction can have a single step where it updates a row, or it can have multiple steps where it fetches a column from a row, queries another table, deletes a couple of rows, etc. In a transaction with multiple steps, having a pessimistic lock may not be a good idea since if the transaction locking the rows fails in between, the time that it spent has gone to waste. While in such scenarios if you have optimistic locking, other processes might be able to complete their transactions.
Problems associated with locks
Locks can cause tricky issues at times. What if the process that acquired the lock fails to release it, locking the entity from being accessed by other processes? It often happens that transactions go idle. If you haven’t set a timeout for transactions, they may acquire a lock and not release it.
Another issue is with lock contentions. There might be resources in your database that are accessed a lot. A number of transactions might try to acquire locks on these resources. As subsequent transactions have to wait for the others to finish, there might be a delay in executing these transactions leading to the contention of these resources or locks. It’s like one of those single checkout counters in stores where everyone has to wait for their turn.
Another issue that usually occurs with locks is a deadlock situation. In this scenario, one transaction acquires a lock on a resource (resource A) and tries to access the lock on another resource (resource B) before releasing the lock on the first resource. At the same time, another resource acquires the lock on resource B and is aiming to acquire the lock on resource A. Now both the processes are not complete without acquiring these locks.
But since resource B is locked by the second process, the first process cannot capture a lock on it. Similarly, the first process has acquired a lock on resource A, so the second process cannot acquire the lock on it. These two processes keep fighting for each other’s locks leading to a deadlock situation.
Let’s look at another issue associated with locks.
Row-level locks are the most common and the safest form of locking. But since locks are maintained in the memory of the database, storing them carries a cost. If there are too many locks on row-level and your database is running out of memory, it might escalate the lock to table-level to free up some memory. This can have implications you may not have anticipated!
Wrapping up
We explored the concept of locks and the issues associated with them. In next week’s issue, we will dive a bit more into the intricacies of locking and explore locks available in Postgres.
📖 Read recommendations for the week
If you like my content and want to support me to keep me going, consider buying me a coffee ☕️ ☕️ ☕️
Connect with me!
If you need help with your service architecture, you can email me 💌 : dennysam14@gmail.com