Consistency vs Availability

During a network partition, the CAP theorem forces a distributed system to choose between consistency and availability. This tradeoff shapes the design of most replicated databases and distributed services in production.

Which Should You Pick?

[Figure: System architecture for Consistency vs Availability, showing how services, databases, and caches connect]

Favor Consistency (CP) if:

Stale data causes financial loss or safety risk (banking, inventory, medical records)
Users expect their writes to be immediately visible to all readers
You can tolerate brief periods of unavailability during partitions
Your system handles transactions that must be atomic

Favor Availability (AP) if:

Users prefer a potentially stale response over no response
The system serves read-heavy workloads where freshness is less critical
You need 99.99% uptime across geographic regions
Business logic can handle and reconcile conflicting versions

Understanding Consistency

[Figure: How Consistency vs Availability works step by step]

In a consistent system, every read returns the most recent write. If you update your email address, the very next read — from any node, in any data center — returns the new email.

Achieving strong consistency in a distributed system requires coordination. Before a write is acknowledged, it must be replicated to a quorum of nodes. Before a read is served, the serving node must confirm that it holds the latest data, typically by contacting a quorum or by holding a valid leader lease.

The cost: Coordination takes time. A write to a strongly consistent system must wait for acknowledgment from multiple nodes, which may be in different data centers with 50-100ms network latency between them. During a network partition, nodes on the minority side of the partition cannot serve reads or accept writes — the system is unavailable for those clients.
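The minority-side behavior described above can be sketched in a few lines. This is an illustrative model only, not any particular database's implementation: `CPNode`, `Unavailable`, and `reachable_peers` are hypothetical names, and replication between nodes is left out so the quorum check stays visible.

```python
from dataclasses import dataclass, field


class Unavailable(Exception):
    """Raised when a node cannot reach a majority of the cluster."""


@dataclass
class CPNode:
    name: str
    cluster_size: int
    reachable_peers: set = field(default_factory=set)  # peers this node can contact right now
    store: dict = field(default_factory=dict)

    def _has_majority(self) -> bool:
        # Count this node plus every peer it can still reach.
        return len(self.reachable_peers) + 1 > self.cluster_size // 2

    def write(self, key, value):
        if not self._has_majority():
            raise Unavailable(f"{self.name}: no quorum, write rejected")
        self.store[key] = value  # replication to peers is omitted in this sketch

    def read(self, key):
        if not self._has_majority():
            raise Unavailable(f"{self.name}: no quorum, read rejected")
        return self.store.get(key)


# A five-node cluster split 3 / 2 by a partition.
majority_side = CPNode("n1", cluster_size=5, reachable_peers={"n2", "n3"})
minority_side = CPNode("n4", cluster_size=5, reachable_peers={"n5"})

majority_side.write("email", "new@example.com")  # succeeds: 3 of 5 nodes reachable

try:
    minority_side.read("email")
    minority_result = "served"
except Unavailable:
    minority_result = "unavailable"  # 2 of 5 is not a majority
```

Clients routed to `n4` or `n5` get errors until the partition heals, which is exactly the availability the CP design gives up.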

Google Spanner is the canonical CP system. It combines synchronized clocks (TrueTime), Paxos replication, and two-phase commit across shards to provide globally consistent reads. The tradeoff is that writes are slower (they wait on cross-region consensus) and, during partitions, replicas cut off from a majority become unavailable. Google accepts this tradeoff because financial and advertising data requires correctness.

PostgreSQL with synchronous replication is another CP choice. Writes are not acknowledged until the replica confirms. If the replica is unreachable, writes block.

Understanding Availability

[Figure: Comparing key metrics for Consistency vs Availability]

In a highly available system, every request that reaches a non-failing node receives a non-error response, even during network partitions. The system never refuses to serve a client, but the data returned might be stale.

The cost: During a partition, different nodes may have different versions of the data. If node A accepted a write that has not yet propagated to node B, a read from node B returns stale data. After the partition heals, the system must reconcile divergent versions — which may require conflict resolution.
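A small simulation makes the stale-read scenario concrete. It is a sketch under simplifying assumptions: `APReplica` and `replicate` are hypothetical names, and asynchronous replication is modeled as an explicit function call rather than a background process.

```python
class APReplica:
    """A replica that always answers from local state (hypothetical sketch)."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def write(self, key, value):
        self.store[key] = value       # accepted locally, replicated later

    def read(self, key):
        return self.store.get(key)    # always answers, possibly with stale data


def replicate(source, target):
    """Copy source's state to target; stands in for asynchronous replication."""
    target.store.update(source.store)


node_a, node_b = APReplica("A"), APReplica("B")
node_a.write("email", "old@example.com")
replicate(node_a, node_b)             # both replicas agree before the partition

# Partition: A accepts a write that cannot reach B.
node_a.write("email", "new@example.com")
stale_read = node_b.read("email")     # B still answers, with the old value

# The partition heals and replication resumes.
replicate(node_a, node_b)
fresh_read = node_b.read("email")
```

Here only one side wrote during the partition, so healing is a simple copy. If both sides had accepted writes to the same key, the copy would need a conflict resolution rule.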

Amazon's Dynamo, the internal key-value store whose design inspired DynamoDB, Cassandra, and Riak, is the canonical AP system. During a partition, every reachable replica continues accepting writes. Conflicting versions are tracked with vector clocks and handed back to the application to reconcile. Amazon's shopping cart system was built on this principle: it is better to show a slightly outdated cart than to show an error page during a network issue. DynamoDB, the managed service, keeps the same bias for reads, which are eventually consistent by default.

Cassandra is another AP system by default. With a replication factor of 3 and consistency level of ONE, reads and writes succeed as long as a single replica is reachable. You can tune Cassandra toward consistency by using QUORUM reads/writes, but this reduces availability.

The Spectrum Between C and A

[Figure: Data flow through Consistency vs Availability]

The CAP theorem presents a binary choice, but real systems operate on a spectrum. Most production systems use tunable consistency:

DynamoDB lets you choose per read request: eventually consistent reads (the default; AP behavior, lower latency) or strongly consistent reads (CP behavior, higher latency). Write replication is handled by the service and is not tunable per request.

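Cassandra lets you set consistency levels per query: ONE, QUORUM, ALL. With ONE, you get availability. With ALL, you get consistency, but a single unreachable replica fails the request. With QUORUM (a majority) on both reads and writes, every read overlaps the latest acknowledged write while the cluster still tolerates a minority of replicas being down, which makes it a practical middle ground.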

CockroachDB is strongly consistent by default but allows stale reads via follower reads when you want lower latency at the cost of potential staleness.
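The Cassandra-style levels above can be reasoned about with the standard quorum overlap rule: a read is guaranteed to see the latest acknowledged write when the number of replicas read (R) plus the number written (W) exceeds the replication factor (N). The helper names below are illustrative and do not belong to any driver API.

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1


def reads_see_latest_write(n: int, w: int, r: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    return r + w > n


def tolerated_failures(n: int, required: int) -> int:
    """How many replicas can be down while `required` replicas are still reachable."""
    return n - required


N = 3  # replication factor
levels = {"ONE": 1, "QUORUM": quorum(N), "ALL": N}

for name, acks in levels.items():
    # Use the same level for reads and writes, as in the text.
    print(name, reads_see_latest_write(N, w=acks, r=acks), tolerated_failures(N, acks))
```

With N = 3 this prints `ONE False 2`, `QUORUM True 1`, and `ALL True 0`: ONE survives two replica failures but cannot promise fresh reads, ALL promises fresh reads but survives no failures, and QUORUM sits between them.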

[Figure: Real systems operate on a spectrum between consistency and availability]

Real-World Decision Patterns

[Figure: Key components of Consistency vs Availability]

Banking and payments: Strong consistency. A double-charge or lost transaction is unacceptable. Payment processors and traditional banks typically keep financial records in strongly consistent stores (for example, PostgreSQL with synchronous replication, or Spanner). They accept slightly higher latency and brief unavailability during partitions because correctness is non-negotiable.

Social media feeds: Eventual consistency. If a user posts a photo and their friend sees it 2 seconds later instead of instantly, nobody notices. Facebook uses an AP approach for the news feed with async replication across data centers. The engineering cost of global strong consistency for billions of daily posts would be enormous with minimal user-visible benefit.

Inventory systems: Depends on the cost of overselling. Amazon uses a combination: optimistic checks with eventual consistency for browsing ("In Stock" labels) but strongly consistent checks at checkout to prevent overselling high-value items. Low-value items might accept occasional overselling and resolve it after the fact.

DNS: Extremely available, eventually consistent. Resolvers cache records until their TTL expires, so a cached answer can be stale for minutes to hours. When you update a DNS record, the change reaches clients gradually as those caches expire. This is acceptable because DNS changes are infrequent and the system must never go down.
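The inventory pattern above, a relaxed label for browsing and a strict check at checkout, can be sketched as follows. This is a single-process stand-in, not a description of any retailer's system: the `Inventory` class is hypothetical, and its lock plays the role that a strongly consistent store would play across machines.

```python
import threading


class Inventory:
    """Authoritative stock count guarded by a lock (stands in for a CP store)."""

    def __init__(self, stock: int):
        self._stock = stock
        self._lock = threading.Lock()

    def current(self) -> int:
        with self._lock:
            return self._stock

    def reserve(self, quantity: int) -> bool:
        # Check and decrement happen atomically, so stock never goes negative.
        with self._lock:
            if self._stock >= quantity:
                self._stock -= quantity
                return True
            return False


inventory = Inventory(stock=1)

# Browse path: a snapshot that may be served from a cache and go stale.
cached_label = "In Stock" if inventory.current() > 0 else "Sold Out"

# Checkout path: every buyer goes through the authoritative check.
first_buyer = inventory.reserve(1)    # True: the last unit is sold
second_buyer = inventory.reserve(1)   # False: rejected instead of oversold
```

After both checkouts, `cached_label` still reads "In Stock" until it is refreshed. The stale label costs a disappointed shopper at most, while the strict `reserve` call is what prevents an oversell.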

Conflict Resolution Strategies

[Figure: Advantages and disadvantages of Consistency vs Availability]

When you choose availability, conflicting writes will happen. You need a strategy to resolve them:

Last-writer-wins (LWW): Use timestamps to pick the most recent write. Simple but lossy: of two concurrent writes, one is silently discarded, and clock skew can make the wrong one win. Cassandra resolves conflicts this way, as do DynamoDB global tables.

Application-level resolution: Present all conflicting versions to the application and let it merge them. Amazon's shopping cart uses union merge: conflicting cart versions are combined by taking the union of items. This ensures no item is lost, though a deleted item might reappear.

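CRDTs (Conflict-free Replicated Data Types): Data structures designed so that replicas can merge concurrent updates automatically and always converge to the same state. Counters, sets, and maps have CRDT variants. Riak and Redis Enterprise (Active-Active databases, formerly called CRDBs) support CRDTs.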
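The three strategies above can be compared side by side in a short sketch. All names here (`Versioned`, `lww_merge`, `merge_carts`, `GCounter`) are hypothetical, the timestamps are assumed to come from roughly synchronized clocks, and the grow-only counter is only the simplest CRDT, chosen to show the merge rule.

```python
from dataclasses import dataclass


# --- Last-writer-wins ---------------------------------------------------
@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: float   # writer-supplied clock reading (an assumption of this sketch)
    node_id: str       # tie-breaker so every replica picks the same winner


def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Keep the write with the larger (timestamp, node_id); the other is dropped."""
    return max(a, b, key=lambda v: (v.timestamp, v.node_id))


# --- Application-level union merge --------------------------------------
def merge_carts(*versions: frozenset) -> frozenset:
    """Keep every item that appears in any conflicting cart version."""
    merged = frozenset()
    for cart in versions:
        merged |= cart
    return merged


# --- CRDT: grow-only counter --------------------------------------------
class GCounter:
    """One slot per node; merge takes the element-wise maximum."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts = {}

    def increment(self, amount: int = 1):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter"):
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


# LWW: the later write wins and the earlier one is lost.
winner = lww_merge(
    Versioned("alice@work.example", timestamp=100.0, node_id="A"),
    Versioned("alice@home.example", timestamp=100.5, node_id="B"),
)

# Union merge: nothing is lost, but the removed lamp comes back.
base = frozenset({"book", "lamp"})
merged_cart = merge_carts(base - {"lamp"}, base | {"headphones"})

# G-Counter: both replicas converge on the same total in any merge order.
a, b = GCounter("A"), GCounter("B")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
a.merge(b)  # merging again changes nothing (idempotent)
```

The usage section mirrors the tradeoffs in the text: `winner` silently drops the work address, `merged_cart` resurrects the deleted lamp, and the two counters both report 5 without any coordination.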

Side-by-Side Comparison

Dimension	Consistency (CP)	Availability (AP)
During Partition	Refuses some requests	Serves all requests
Data Freshness	Always current	May be stale
Write Latency	Higher (quorum required)	Lower (single node ack)
Conflict Handling	Prevention (locking)	Resolution (merge)
Use Cases	Finance, inventory, auth	Social, analytics, caching
Examples	Spanner, PostgreSQL, ZooKeeper	DynamoDB, Cassandra, DNS

In practice, most systems do not make a single global choice. They use strong consistency for the data that requires it (account balances, authentication tokens) and eventual consistency for everything else (recommendations, activity feeds, analytics). The skill is knowing which data falls into which category.

[Figure: Real-world examples of Consistency vs Availability at companies like Netflix, Google, and Amazon]

[Figure: Consistency vs Availability decision framework]

[Figure: Consistency vs Availability interview tips]

Practical Implementation for .NET Developers

In a .NET application, a typical way to build a service around these tradeoffs looks like this:

ASP.NET Core setup: Create a service class that encapsulates the logic, register it with dependency injection, and inject it into your controllers or minimal API endpoints. The built-in DI container handles lifecycle management.

Entity Framework Core: For database interactions, EF Core provides the ORM layer. Use migrations for schema management and raw SQL for performance-critical queries. Consider Dapper for read-heavy paths where EF Core's overhead matters.

Azure integration: If deploying to Azure, leverage managed services — Azure Cache for Redis, Azure SQL, Azure Service Bus, Azure Cosmos DB. These eliminate operational overhead and provide built-in monitoring through Application Insights.

Testing: Use xUnit with Testcontainers for integration tests that spin up real databases in Docker. Mock external dependencies with NSubstitute. The WebApplicationFactory class lets you test your entire HTTP pipeline in-process.

Monitoring: Add Application Insights telemetry to track request latency, dependency calls, and custom metrics. Use structured logging with Serilog to make production debugging possible:

Log.Information("Processing order {OrderId} for {CustomerId}", orderId, customerId);

This gives you searchable, structured logs in Azure Monitor or Seq.

Sources

Brewer's CAP Theorem (article)