Concurrency Control Protocols
Imagine a bank account shared by you and your partner. If you both try to withdraw money at the exact same millisecond, who gets the cash? Concurrency control is the "traffic police" of database systems that prevents chaos when multiple transactions access the same data simultaneously.
lightbulb Real World Analogy
Without protocols, a database is like a 4-way intersection without traffic lights or stop signs—crashes (data corruption) are inevitable. Protocols act as the signals, roundabouts, or police officers ensuring everyone gets through safely, even if they have to wait a bit.
Lock-Based Protocols (2PL)
Two-Phase Locking (2PL) is the most common protocol. It ensures serializability by forcing transactions to acquire locks in two distinct phases. It prevents "interleaving" errors by physically locking data items.
-
1
Growing Phase:
Transaction may obtain locks, but cannot release any lock.
-
2
Shrinking Phase:
Transaction may release locks, but cannot obtain any new locks.
Visualizing 2PL
Analogy: The Library Rule
Imagine you are researching a paper.
Growing Phase: You run around the library grabbing all 5 books you need. You cannot sit down and read until you have them all.
Lock Point: You have all 5 books.
Shrinking Phase: As you finish a chapter, you return the books one by one to the shelf for others to use. You cannot grab new books once you start returning them.
Interactive Lock Simulator
Try to have Transaction B write to Item X while Transaction A holds a lock.
Timestamp Ordering
Instead of locks, every transaction gets a unique timestamp when it starts (TS). The protocol ensures that for every pair of conflicting operations, the order of execution matches the order of timestamps. Older transactions (smaller TS) have priority.
Logic Flow
Analogy: The Ticket Queue
Think of a deli counter. Everyone pulls a numbered ticket (Timestamp).
If Person #10 wants to order a sandwich, but sees that Person #20 has already been served that last bagel, Person #10 cannot order it.
In databases, if an "older" transaction tries to read/write data that a "younger" transaction has already touched, the older one is too late. It must restart (get a new ticket).
Optimistic Control (Validation)
"Ask for forgiveness, not permission." Optimistic protocols assume conflicts are rare. They let transactions execute freely without locks, and check for conflicts only at the very end before committing.
1. Read Phase
Execute on local copies. No global writes.
2. Validation Phase
Check if my local changes conflict with others.
3. Write Phase
If validation passes, copy changes to DB.
Analogy: Collaborative Coding
Like using Git.
Read: You clone the repo and work locally on your laptop. You don't lock the main server.
Validate: You try to push (Merge Request). Git checks if anyone else modified the same lines while you were working.
Write: If no conflicts, your code merges. If conflicts, you fix (restart) and try again.
Multi-Version Concurrency Control (MVCC)
Used by PostgreSQL, Oracle, and MySQL (InnoDB). The system keeps multiple versions of data items.
Readers don't block Writers, and Writers don't block Readers.
Database State (Visualized)
Started before T5 finished.
Reads Ver 1
Interview Quickfire
Common Questions
Basic 2PL: Allows a transaction to release locks during the shrinking phase before it commits. This maximizes concurrency but can lead to Cascading Rollbacks if the transaction aborts after releasing locks.
Strict 2PL: Holds all exclusive (write) locks until the transaction commits or aborts. This prevents cascading rollbacks and ensures a cleaner recovery schedule.
It occurs when a transaction reads a set of rows satisfying a condition (e.g., "SELECT * FROM Users WHERE Age > 18"), and another transaction inserts a new row that meets that condition. If the first transaction runs the same query again, it sees a "phantom" row that wasn't there before. Locking individual rows isn't enough; you often need Index Locking or Gap Locking to prevent this.
Use Pessimistic (Locks) when write conflicts are high (e.g., ticket booking systems where everyone wants the same seat). The cost of locking is cheaper than rolling back.
Use Optimistic when conflicts are rare (e.g., browsing a product catalog or editing a personal profile). It avoids the overhead of managing locks.