Design UPI Payment System
Design a UPI-like payment system with real-time payment flows, idempotency, settlement, and fraud detection. Covers NPCI architecture patterns.
Problem Statement
Design a Unified Payments Interface (UPI) system that enables instant bank-to-bank transfers using a virtual payment address (VPA). The system must process payments in real time, guarantee exactly-once execution through idempotency, handle settlement between banks, and detect fraudulent transactions -- all at the scale of billions of transactions per month.
Requirements
Functional
- Register VPA (e.g., user@bank) mapped to bank account
- Send money: payer initiates transfer to a payee VPA, debiting payer's bank and crediting payee's bank
- Collect request: payee requests money from payer (payer must approve)
- Transaction history with status tracking (pending, success, failed)
Non-Functional
- Latency: End-to-end payment <5 seconds including bank round-trips
- Consistency: Exactly-once execution -- no double debits, no lost credits
- Availability: 99.95% -- failed payments must auto-reverse within 24 hours
- Scale: 10B transactions/month, 50K TPS at peak
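As a rough sanity check on the scale requirement, the monthly volume can be converted into an average rate and compared with the stated peak. The 30-day month is an assumption made only for this arithmetic.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: 30-day month

monthly_txns = 10_000_000_000  # 10B transactions/month, from the requirements
peak_tps = 50_000              # stated peak throughput

average_tps = monthly_txns / SECONDS_PER_MONTH
peak_to_average = peak_tps / average_tps

print(f"average TPS ~ {average_tps:,.0f}")                 # average TPS ~ 3,858
print(f"peak/average ratio ~ {peak_to_average:.1f}x")      # peak/average ratio ~ 13.0x
```

Under that assumption the peak is roughly 13 times the average, so capacity should be sized for the peak rather than for the monthly mean.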
Core Architecture
- Payment Switch (NPCI-equivalent) -- Central routing layer that resolves payee VPA to the payee's bank, routes the debit request to the payer's bank and the credit request to the payee's bank. Uses a two-phase protocol: debit-first, then credit. If credit fails, initiates auto-reversal of the debit.
- Idempotency Engine -- Every transaction carries a unique idempotency key (client-generated UUID). The engine stores key-to-outcome mappings in a fast lookup store. Duplicate requests return the cached outcome without re-executing. Keys expire after 24 hours.
- Settlement Service -- Banks do not move actual money per transaction. Instead, the settlement service aggregates net positions between banks every 15 minutes and settles via a central clearing account. Generates settlement files in ISO 20022 format.
- Fraud Detection Pipeline -- Real-time scoring engine that evaluates each transaction against velocity rules (e.g., >5 transactions in 1 minute), amount thresholds, device fingerprint changes, and ML-based anomaly scores. High-risk transactions are held for step-up authentication (PIN re-entry).
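The velocity rule in the fraud detection pipeline (more than 5 transactions in 1 minute) could be sketched as a per-payer sliding window. This is a minimal single-process illustration. The class name `VelocityRule` and its parameters are hypothetical, and a production pipeline would keep the counters in a shared store and combine this rule with the other signals listed above.

```python
from collections import defaultdict, deque


class VelocityRule:
    """Flags a payer who exceeds `max_txns` within a sliding `window_seconds`."""

    def __init__(self, max_txns: int = 5, window_seconds: float = 60.0):
        self.max_txns = max_txns
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # payer_vpa -> recent timestamps

    def is_suspicious(self, payer_vpa: str, now: float) -> bool:
        events = self._events[payer_vpa]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        events.append(now)
        # ">5 in 1 minute": the sixth transaction inside the window trips the rule.
        return len(events) > self.max_txns


rule = VelocityRule()
flags = [rule.is_suspicious("alice@bank1", float(t)) for t in range(6)]
print(flags)  # [False, False, False, False, False, True]
```

A flagged transaction would then be held for step-up authentication rather than rejected outright.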
Database Choice
PostgreSQL for the VPA registry and transaction ledger. Ledger rows are inserted once and never deleted; only status and updated_at change afterwards. Columns: txn_id, payer_vpa, payee_vpa, amount, status, created_at, updated_at. Status transitions (INITIATED -> DEBIT_SUCCESS -> CREDIT_SUCCESS or REVERSED) are serialized via row-level locks on txn_id. Redis for idempotency key lookup (SET NX with 24h TTL) and rate limiting. Kafka for the event bus between payment switch, banks, and settlement.
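The idempotency lookup described here (SET NX with a 24h TTL) can be illustrated with an in-memory stand-in for Redis. `IdempotencyStore` and `pay_once` are hypothetical names. The store only mimics "set if absent, with expiry" inside one process, and the `IN_PROGRESS` response for a concurrent duplicate is an assumption rather than something the design specifies.

```python
import time


class IdempotencyStore:
    """In-memory stand-in for Redis SET NX with a TTL (single process only)."""

    def __init__(self, ttl_seconds: float = 24 * 3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, outcome or None)

    def claim(self, key: str) -> bool:
        """Return True only for the first caller to use a live key."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return False
        self._entries[key] = (now + self._ttl, None)  # None = still executing
        return True

    def record(self, key: str, outcome: dict) -> None:
        expires_at, _ = self._entries[key]
        self._entries[key] = (expires_at, outcome)

    def outcome(self, key: str):
        entry = self._entries.get(key)
        if entry is None or entry[0] <= self._clock():
            return None
        return entry[1]


def pay_once(store: IdempotencyStore, key: str, execute):
    """Run `execute` at most once per key; duplicates get the cached outcome."""
    if store.claim(key):
        result = execute()  # if this raises, the key stays claimed until expiry
        store.record(key, result)
        return result
    cached = store.outcome(key)
    return cached if cached is not None else {"status": "IN_PROGRESS"}
```

Claiming the key before executing, rather than checking and then writing, is what closes the race between two simultaneous retries of the same request.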
Key API Endpoints
POST /api/v1/pay
-> Body: { payer_vpa: "alice@bank1", payee_vpa: "bob@bank2", amount: 500.00, idempotency_key: "uuid-123" }
-> Returns: { txn_id: "TXN-789", status: "INITIATED" }
GET /api/v1/transactions/{txn_id}
-> Returns: { txn_id: "TXN-789", status: "SUCCESS", amount: 500.00, timestamp: "..." }
POST /api/v1/collect
-> Body: { requester_vpa: "bob@bank2", payer_vpa: "alice@bank1", amount: 200.00, note: "Dinner split" }
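A handler for POST /api/v1/pay would probably validate the body before touching the ledger. The sketch below is one possible shape. The VPA pattern, the two-decimal-place limit, and the names `PayRequest` and `parse_pay_request` are all assumptions for illustration, not rules taken from the UPI specification. Amounts are parsed into `Decimal` so binary floating-point rounding never reaches the ledger.

```python
import re
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation

# Assumed VPA shape "handle@psp"; real handle rules may differ.
VPA_PATTERN = re.compile(r"^[A-Za-z0-9._-]+@[A-Za-z0-9]+$")


@dataclass(frozen=True)
class PayRequest:
    payer_vpa: str
    payee_vpa: str
    amount: Decimal
    idempotency_key: str


def parse_pay_request(body: dict) -> PayRequest:
    for field in ("payer_vpa", "payee_vpa"):
        if not VPA_PATTERN.match(str(body.get(field, ""))):
            raise ValueError(f"invalid {field}")
    if body["payer_vpa"] == body["payee_vpa"]:
        raise ValueError("payer and payee must differ")
    try:
        # Go through str() so a JSON float such as 500.0 is read as written.
        amount = Decimal(str(body.get("amount")))
    except InvalidOperation:
        raise ValueError("invalid amount") from None
    if not amount.is_finite() or amount <= 0:
        raise ValueError("amount must be a positive number")
    if amount.as_tuple().exponent < -2:
        raise ValueError("amount has more than two decimal places")
    key = body.get("idempotency_key")
    if not isinstance(key, str) or not key:
        raise ValueError("idempotency_key is required")
    return PayRequest(body["payer_vpa"], body["payee_vpa"],
                      amount.quantize(Decimal("0.01")), key)
```

For the example body above, `parse_pay_request` returns a `PayRequest` whose amount is `Decimal("500.00")`.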
Scaling Insight
The debit-first, credit-second pattern with auto-reversal is the key reliability mechanism. By always debiting first, the system ensures money is never created out of thin air: a credit is only ever issued against funds that have already been debited. If the credit leg fails (for example, the payee bank is down), a reversal job re-credits the payer, typically within minutes and always inside the 24-hour auto-reversal bound from the requirements. This approach avoids the complexity of distributed transactions across banks while maintaining financial integrity.
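The debit-first flow with auto-reversal can be sketched against two stub banks. `InMemoryBank`, `BankUnavailable`, and `transfer` are hypothetical names, amounts are integer paise, and the `FAILED` status for a refused debit is an assumption (the ledger description names only the other four states). The reversal here is done inline for brevity, whereas the design describes a separate reversal job.

```python
from enum import Enum


class Status(Enum):
    INITIATED = "INITIATED"
    DEBIT_SUCCESS = "DEBIT_SUCCESS"
    CREDIT_SUCCESS = "CREDIT_SUCCESS"
    REVERSED = "REVERSED"
    FAILED = "FAILED"  # assumption: debit leg refused, nothing to reverse


class BankUnavailable(Exception):
    pass


class InMemoryBank:
    """Hypothetical bank stub holding balances in integer paise."""

    def __init__(self, balances: dict, available: bool = True):
        self.balances = dict(balances)
        self.available = available

    def debit(self, account: str, amount: int) -> None:
        if not self.available:
            raise BankUnavailable(account)
        if self.balances.get(account, 0) < amount:
            raise ValueError("insufficient funds")
        self.balances[account] -= amount

    def credit(self, account: str, amount: int) -> None:
        if not self.available:
            raise BankUnavailable(account)
        self.balances[account] = self.balances.get(account, 0) + amount


def transfer(payer_bank, payer, payee_bank, payee, amount):
    """Return the list of statuses the transaction passed through."""
    history = [Status.INITIATED]
    try:
        payer_bank.debit(payer, amount)
    except (BankUnavailable, ValueError):
        history.append(Status.FAILED)
        return history
    history.append(Status.DEBIT_SUCCESS)
    try:
        payee_bank.credit(payee, amount)
    except BankUnavailable:
        # Auto-reversal: give the money back to the payer. A real system
        # would retry this from a durable queue until it succeeds.
        payer_bank.credit(payer, amount)
        history.append(Status.REVERSED)
        return history
    history.append(Status.CREDIT_SUCCESS)
    return history
```

Because the credit is attempted only after the debit has succeeded, every failure path leaves total funds unchanged: either nothing moved, or the debit was undone.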
Key Tradeoffs
| Decision | Option A | Option B | Chosen |
|---|---|---|---|
| Transaction model | Synchronous 2PC across banks | Debit-first with async reversal | Debit-first -- banks are independent systems, 2PC is impractical across org boundaries |
| Settlement | Real-time gross (per txn) | Net settlement (batched) | Net settlement -- far fewer inter-bank transfers than per-transaction settlement, lower cost |
| Fraud detection | Pre-transaction blocking | Post-transaction flagging | Pre-transaction -- prevents fraud rather than chasing refunds, slight latency cost acceptable |
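To make the net-settlement choice concrete, the sketch below collapses the transfers in one settlement window into a single net position per bank. The function name `net_positions`, the sample transfers, and the use of integer paise are assumptions for illustration.

```python
from collections import defaultdict


def net_positions(transfers):
    """transfers: iterable of (payer_bank, payee_bank, amount_in_paise).

    Returns each bank's net position for the window: negative means the
    bank owes the clearing account, positive means it is owed.
    """
    positions = defaultdict(int)
    for payer_bank, payee_bank, amount in transfers:
        positions[payer_bank] -= amount
        positions[payee_bank] += amount
    return dict(positions)


window = [
    ("bank1", "bank2", 50_000),
    ("bank2", "bank1", 20_000),
    ("bank1", "bank3", 10_000),
    ("bank3", "bank2", 5_000),
]
positions = net_positions(window)
print(positions)  # {'bank1': -40000, 'bank2': 35000, 'bank3': 5000}
```

Four customer transfers become at most three movements against the clearing account, and the positions always sum to zero, which gives the settlement service a cheap integrity check before it generates a settlement file.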
Practical Implementation for .NET Developers
In a .NET application, you would typically implement this pattern using the following approach:
ASP.NET Core setup: Create a service class that encapsulates the logic, register it with dependency injection, and inject it into your controllers or minimal API endpoints. The built-in DI container handles lifecycle management.
Entity Framework Core: For database interactions, EF Core provides the ORM layer. Use migrations for schema management and raw SQL for performance-critical queries. Consider Dapper for read-heavy paths where EF Core's overhead matters.
Azure integration: If deploying to Azure, leverage managed services — Azure Cache for Redis, Azure SQL, Azure Service Bus, Azure Cosmos DB. These eliminate operational overhead and provide built-in monitoring through Application Insights.
Testing: Use xUnit with Testcontainers for integration tests that spin up real databases in Docker. Mock external dependencies with NSubstitute. The WebApplicationFactory class lets you test your entire HTTP pipeline in-process.
Monitoring: Add Application Insights telemetry to track request latency, dependency calls, and custom metrics. Use structured logging with Serilog to make production debugging possible:
Log.Information("Processing payment {TxnId} from {PayerVpa}", txnId, payerVpa);
This gives you searchable, structured logs in Azure Monitor or Seq.
System-Specific Clarifying Questions
Before designing a UPI-like payment system, ask questions specific to this system:
- Who are the primary users? Understanding the user base shapes every technical decision — consumer apps have different requirements than enterprise B2B systems.
- What is the read-to-write ratio? This determines whether you optimize for fast reads (caching, denormalization) or fast writes (write-ahead logs, async processing).
- What is the geographic distribution? Users in one country vs. global users fundamentally changes your data replication and CDN strategy.
- What is the acceptable latency? Some features need sub-100ms responses, others can tolerate seconds. This determines your caching and architecture strategy.
- What is the consistency requirement? Some data (payments, inventory) needs strong consistency. Other data (social feeds, recommendations) can be eventually consistent.
Architecture Deep Dive
The architecture for a UPI-like payment system should be designed around the system's specific access patterns. Do not apply generic templates; every system has unique hotspots, bottlenecks, and scaling challenges.
Write Path: How does data enter the system? Is it bursty (event-driven, flash sales) or steady (sensor data, logs)? Bursty writes need queuing and backpressure. Steady writes can go directly to the database.
Read Path: How is data consumed? Is it fan-out (one write, many reads like social feeds) or point lookups (one read for specific data like user profiles)? Fan-out reads benefit from pre-computation and caching. Point lookups benefit from efficient indexing.
Hot Spots: Where are the bottlenecks? For a UPI-like payment system, identify the component that will fail first under load and design mitigation strategies: caching, sharding, rate limiting, or async processing.