Idempotency
An operation is idempotent if performing it multiple times produces the same result as performing it once. In APIs, idempotency ensures that retrying a request (due to network timeouts, client errors, or duplicate submissions) does not cause unintended side effects like double charges, duplicate orders, or redundant emails.
Why This Matters
Network failures are inevitable. Clients will retry requests. Without idempotency, retries can cause catastrophic bugs — a payment system that charges twice, an order system that ships duplicate items. Idempotency is essential for reliable distributed systems.
The Building Blocks
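- Naturally idempotent: GET, PUT, and DELETE are defined as idempotent by the HTTP specification. Repeating a GET does not change server state, repeating a PUT leaves the resource in the same state, and deleting a resource that is already deleted is a no-op (the status code may differ, for example 404 on the second call, but the server state does not).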
- Not naturally idempotent: POST is not idempotent — posting an order twice creates two orders. You must add idempotency explicitly.
- Idempotency keys: The client sends a unique key (UUID) with each request. The server stores the key and result. If the same key arrives again, the server returns the cached result without re-executing.
- Database-level idempotency: Use unique constraints (ON CONFLICT DO NOTHING) or conditional writes (IF NOT EXISTS) to prevent duplicates.
- At-least-once delivery: Many message queues deliver messages at least once. Your consumer must be idempotent.
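The last two building blocks can be combined: a consumer that records each message ID under a unique constraint can tell a first delivery from a redelivery. The following is a minimal sketch, not a production implementation. It assumes SQLite (3.24 or newer, for the ON CONFLICT clause) as a stand-in for the real database, and the table names, column names, and the handle_message function are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a dedupe table and an orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, item TEXT NOT NULL)")
    return conn


def handle_message(conn: sqlite3.Connection, message_id: str, order_id: str, item: str) -> bool:
    """Apply a message's side effect once; return True only if this call did the work."""
    with conn:  # one transaction: the dedupe record and the order commit together
        cur = conn.execute(
            "INSERT INTO processed_messages (message_id) VALUES (?) "
            "ON CONFLICT (message_id) DO NOTHING",
            (message_id,),
        )
        if cur.rowcount == 0:
            return False  # the unique constraint rejected the row: duplicate delivery
        conn.execute(
            "INSERT INTO orders (order_id, item) VALUES (?, ?)",
            (order_id, item),
        )
        return True


conn = make_db()
first = handle_message(conn, "msg-1", "order-1", "book")   # first delivery
second = handle_message(conn, "msg-1", "order-1", "book")  # queue redelivers the same message
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(first, second, order_count)
```

Writing the dedupe row and the order in the same transaction matters: if the order insert fails, the dedupe row rolls back too, so a later redelivery can still be processed.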
Under the Hood
For payment APIs: The client generates an idempotency key (UUID) and sends it in a header of the payment request. The server checks whether this key already exists in a store such as Redis. If it does, the server returns the stored result. If it does not, the server processes the payment, stores the result under the idempotency key, and returns it.
This ensures that even if the client retries 10 times, the customer is charged only once.
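The flow above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular payment provider implements it: a plain dictionary stands in for the shared store (Redis in the description above), and the PaymentService class, its fields, and the charge ID format are hypothetical. It is single-threaded and does not handle two requests racing on the same key.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class PaymentService:
    # Hypothetical stand-in for a shared key/result store such as Redis.
    results: dict = field(default_factory=dict)
    # Record of charges actually executed, so the example can count them.
    charges: list = field(default_factory=list)

    def charge(self, idempotency_key: str, customer_id: str, amount_cents: int) -> dict:
        cached = self.results.get(idempotency_key)
        if cached is not None:
            return cached  # key seen before: return the stored result, do not charge again

        # Key not seen: perform the side effect exactly here.
        self.charges.append((customer_id, amount_cents))
        result = {
            "charge_id": f"ch_{len(self.charges)}",
            "amount_cents": amount_cents,
            "status": "succeeded",
        }
        self.results[idempotency_key] = result  # store the result under the key
        return result


service = PaymentService()
key = str(uuid.uuid4())  # generated by the client, sent with every retry

first = service.charge(key, "cust_42", 1999)
retries = [service.charge(key, "cust_42", 1999) for _ in range(10)]

print(len(service.charges))  # prints 1
```

Every retry receives the same response as the first call, which is what lets the client treat a retry as safe.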
How Companies Actually Do This
Stripe accepts an optional Idempotency-Key header on POST requests. Retrying with the same key returns the original response.
Amazon SQS standard queues provide at-least-once delivery. Consumers must be idempotent to handle duplicate messages.
Airbnb uses idempotency keys in their payments system to prevent double-charging guests.
Common Pitfalls
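- Not implementing idempotency for payment or order APIs: retries will eventually produce double charges or duplicate orders
- Setting idempotency key TTL too short: a client that retries after the key expires is processed as a new request
- Using non-unique idempotency keys: reusing one key for different operations, or generating a fresh key on every retry, defeats the entire purpose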
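On the client side, the key has to be generated once per logical operation and then reused on every retry of that operation. The sketch below illustrates this with a hypothetical FlakyServer class that applies the request but "loses" its first response, which is the failure a retry is meant to cover. The class, the create_order_with_retries helper, and the retry limit are assumptions made for the example; a real client would also wait between attempts.

```python
import uuid


class FlakyServer:
    """Hypothetical server: it applies the first request, but that response is lost."""

    def __init__(self):
        self.results = {}        # idempotency key -> stored response
        self.orders_created = 0  # number of orders actually created
        self.calls = 0           # number of requests received

    def create_order(self, idempotency_key: str, item: str) -> dict:
        self.calls += 1
        if idempotency_key not in self.results:
            self.orders_created += 1
            self.results[idempotency_key] = {"order_id": self.orders_created, "item": item}
        if self.calls == 1:
            # The order exists on the server, but the client never hears about it.
            raise TimeoutError("response lost in transit")
        return self.results[idempotency_key]


def create_order_with_retries(server: FlakyServer, item: str, max_attempts: int = 3) -> dict:
    key = str(uuid.uuid4())  # one key per logical operation, created outside the loop
    for attempt in range(1, max_attempts + 1):
        try:
            return server.create_order(key, item)  # same key on every attempt
        except TimeoutError:
            if attempt == max_attempts:
                raise


server = FlakyServer()
order = create_order_with_retries(server, "book")
print(server.calls, server.orders_created)  # prints: 2 1
```

If the key were generated inside the loop, the second attempt would look like a new request and the server would create a second order.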
Interview Questions Worth Practicing
- What is idempotency and why is it important?
- Which HTTP methods are naturally idempotent?
- How do you implement idempotency in a payment system?
- What is the difference between at-least-once and exactly-once delivery?
The Tradeoffs
- Storage overhead: Storing idempotency keys and results requires extra storage and TTL management.
- Complexity: Adding idempotency logic to every endpoint adds development time.
- Concurrency: Two identical requests arriving simultaneously may both miss the key check. Make the check-and-store step atomic with a database lock, a unique constraint, or compare-and-swap.
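The concurrency point can be illustrated inside a single process. In the sketch below, the check for the key and the claim on it happen under one lock, so only one of several simultaneous callers runs the operation and the rest wait for its result. The IdempotentExecutor class is hypothetical, and the sketch assumes the operation does not raise. Across several servers, the same atomic claim would come from the shared store instead, for example a unique-constraint insert or Redis SET with the NX option.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class IdempotentExecutor:
    """Runs an operation at most once per key, even under concurrent calls."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # key -> {"done": Event, "result": value}

    def run(self, key, operation):
        # The check and the claim happen under one lock, so two callers
        # can never both conclude that the key is unseen.
        with self._lock:
            entry = self._entries.get(key)
            owner = entry is None
            if owner:
                entry = {"done": threading.Event(), "result": None}
                self._entries[key] = entry

        if owner:
            try:
                entry["result"] = operation()  # only the claiming caller executes
            finally:
                entry["done"].set()  # always release waiters
        else:
            entry["done"].wait()  # duplicates wait for the owner's result
        return entry["result"]


counter = {"n": 0}


def charge():
    counter["n"] += 1
    return "charged"


executor = IdempotentExecutor()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda _: executor.run("key-1", charge), range(8)))

print(counter["n"])  # prints 1
```

The lock is held only while claiming the key, not while the operation runs, so requests with different keys do not block each other.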
The Real-World Incident That Made This Famous
Idempotency tends to get attention only after a production incident in which retried requests duplicated a side effect. When a system handles millions of users, even a small gap in retry handling can turn into duplicate charges or orders that cost revenue and erode user trust. Companies like Netflix, Google, Amazon, and Meta invest heavily in getting this right for that reason.
The key lesson: idempotency is not just a theoretical concept. It is a practical skill that separates engineers who build resilient systems from those who build fragile ones.
How Senior Engineers Think About This
Senior engineers do not stop at the textbook definition of idempotency. Instead of memorizing rules, they build mental models. They ask: "What problem does idempotency solve? When does it fail? What are the alternatives?" This problem-first thinking leads to better design decisions because every system has unique constraints.
When evaluating idempotency in a system design context, experienced engineers consider the failure modes first. What happens if the key store goes down? What happens if a request fails after the side effect but before the result is stored? Is the degradation graceful or catastrophic? These questions reveal more about your understanding than any textbook definition.
Common Interview Mistakes
Mistake 1: Giving a textbook definition without context. Interviewers want to see you connect idempotency to real systems and real problems.
Mistake 2: Not discussing trade-offs. Every design decision involving idempotency has trade-offs. Discuss what you gain and what you give up.
Mistake 3: Overcomplicating the solution. Start with the simplest approach to idempotency that meets the requirements, then add complexity only when justified.
Production Checklist
- Define clear metrics for measuring the effectiveness of your idempotency implementation
- Set up monitoring and alerting that specifically tracks idempotency-related failures
- Document your idempotency design decisions in Architecture Decision Records (ADRs)
- Test failure scenarios related to idempotency in staging before production deployment
- Review and update your idempotency implementation quarterly as system requirements evolve
- Train new team members on the specific idempotency patterns used in your system
Content from System-Design-Overview
Practical Implementation for .NET Developers
In a .NET application, you would typically implement this pattern using the following approach:
ASP.NET Core setup: Create a service class that encapsulates the logic, register it with dependency injection, and inject it into your controllers or minimal API endpoints. The built-in DI container handles lifecycle management.
Entity Framework Core: For database interactions, EF Core provides the ORM layer. Use migrations for schema management and raw SQL for performance-critical queries. Consider Dapper for read-heavy paths where EF Core's overhead matters.
Azure integration: If deploying to Azure, leverage managed services — Azure Cache for Redis, Azure SQL, Azure Service Bus, Azure Cosmos DB. These eliminate operational overhead and provide built-in monitoring through Application Insights.
Testing: Use xUnit with Testcontainers for integration tests that spin up real databases in Docker. Mock external dependencies with NSubstitute. The WebApplicationFactory class lets you test your entire HTTP pipeline in-process.
Monitoring: Add Application Insights telemetry to track request latency, dependency calls, and custom metrics. Use structured logging with Serilog to make production debugging possible:
Log.Information("Processing order {OrderId} for {CustomerId}", orderId, customerId);
This gives you searchable, structured logs in Azure Monitor or Seq.