State and consistency
When the same data lives in many places, keeping it coherent is the core challenge.
- Distinguish strong from eventual consistency
- Explain why shared distributed state is hard
- Recognise idempotency as a tool for unreliable delivery
The concurrency lesson warned that shared mutable state is the hard part of doing many things at once. Distribute that state across machines and it gets harder still: now the copies can disagree, and the network between them is unreliable. Managing distributed state coherently is the central problem of the field.
Strong vs eventual consistency
When data is replicated across nodes, you choose how synchronised the copies must be:
- Strong consistency: every read returns the most recent write. Simple to reason about, but slower and less available — writes must coordinate across replicas before succeeding.
- Eventual consistency: replicas may briefly disagree but converge if writes stop. Fast and highly available, at the cost of possibly reading stale data for a moment.
This is the CAP trade-off made concrete. A "like" count can be eventually consistent (a stale count is harmless); an account balance usually wants strong consistency. Match the guarantee to what the data can tolerate.
Idempotency for unreliable delivery
Because a network call might be retried (you couldn't tell if the first attempt landed), operations that change state should be idempotent — safe to apply more than once with the same result. "Set balance to 100" is idempotent; "add 100" is not. Designing for idempotency (often via a unique request id the server de-duplicates) is how you survive the retries that distribution forces on you.
Prefer not to share state
The cleanest distributed designs minimise shared mutable state, echoing the concurrency lesson:
- Keep services stateless where possible — push state into a dedicated data store so any instance can handle any request.
- Give each service ownership of its own data; others ask via its API rather than reaching into its database.
- Use a single source of truth for each piece of data, rather than several copies that can drift.
Distributed transactions across services are notoriously hard and slow — avoid them when you can. It's often better to design so each step is independently correct and idempotent than to coordinate one big all-or-nothing operation.
Where to go next
One of the most common — and most dangerous — ways state goes stale is caching. Next: caching.