Code of the Day

MVCC and concurrency

How databases serve concurrent readers and writers without locking, and the storage cost of keeping multiple row versions.

Database Internals5 min read
By the end of this lesson you will be able to:
  • Describe how MVCC allows concurrent readers and writers without locking

MVCC for concurrent readers and writers

A naive way to implement isolation is to lock every row being read. This is simple but slow: a long-running read holds a lock that blocks all writers.

Multi-Version Control (MVCC) takes a different approach: instead of locking, the database keeps multiple versions of each row. When a modifies a row, it writes a new version tagged with that transaction's timestamp (or transaction ID), leaving the old version in place. Other transactions that started before the modification see the old version; transactions that start after the commit see the new version.

Row 42 at t=100: { status: 'pending',  xmin: 50,  xmax: 110 }
Row 42 at t=110: { status: 'shipped',  xmin: 110, xmax: null }

Transaction started at t=105 reads Row 42 → sees 'pending' (old version still valid)
Transaction started at t=115 reads Row 42 → sees 'shipped' (new version)

The consequence: readers never block writers and writers never block readers. Concurrent reads and writes proceed in parallel without contention. The cost is storage: old row versions must be retained until no active transaction can still reference them. PostgreSQL's VACUUM process (and MySQL's InnoDB purge thread) periodically remove stale versions that no running transaction can see.

Isolation levels and MVCC limits

MVCC solves most read-write contention but does not eliminate all anomalies. Write skew (two transactions each read disjoint data and make decisions that together violate a constraint) can still occur at the REPEATABLE READ level.

Full SERIALIZABLE isolation prevents write skew but uses predicate locks or optimistic concurrency checks that add overhead. Choose the isolation level deliberately — defaulting to SERIALIZABLE everywhere is rarely necessary and always costly.

MVCC is why a long-running transaction is expensive even if it is only reading. It holds a "snapshot" of the database at the time it started — preventing VACUUM from removing any row version that postdates that snapshot. A transaction open for hours can cause table bloat (PostgreSQL calls this "transaction ID wraparound risk" in extreme cases). Always set statement timeouts and transaction timeouts in production.

Where to go next

Database engines build on nearly every concept in these tracks: they exploit the memory hierarchy (Computer Architecture) for buffer pool management, cross the user/kernel boundary (Operating Systems) on every disk write, and receive queries over network connections (Networking Fundamentals). The SQL track applies these internals directly: after reading this lesson, the SQL intermediate modules on indexes and query planning will make considerably more sense.

Knowledge check

  1. 1.
    Under MVCC, a long-running read transaction blocks writers from updating the rows it is reading.
  2. 2.
    PostgreSQL's VACUUM process serves what purpose in the context of MVCC?
Finished reading? Mark it complete to track your progress.

On this page