SDMastery

System Design Concepts

60 concepts across 8 categories. Master the fundamentals before tackling interview problems.

Core Concepts

Beginner · Core Concepts

Scalability

Every production system eventually faces growth. If your architecture cannot scale, you will hit a wall: either the system crashes under load, or you…

10 min read

Beginner · Core Concepts

Availability

Users and businesses depend on systems being available. A payment system that goes down for 1 hour can cost millions of dollars.

9 min read

Beginner · Core Concepts

Reliability

A system can be available (running) but unreliable (returning wrong results). A payment system that double-charges customers is available but unreliable.

7 min read

Beginner · Core Concepts

Single Point of Failure (SPOF)

Identifying and eliminating SPOFs is one of the first things an interviewer expects in a system design discussion.

7 min read

Beginner · Core Concepts

Latency vs Throughput vs Bandwidth

Confusing latency and throughput is a common interview mistake. A system can have high throughput but high latency (batch processing), or low latency but…

7 min read

Intermediate · Core Concepts

Consistent Hashing

Consistent hashing is the backbone of distributed caching (Memcached), distributed databases (DynamoDB, Cassandra), load balancing, and CDNs.

10 min read

Intermediate · Core Concepts

CAP Theorem

The CAP theorem is one of the most frequently asked theoretical concepts in system design interviews. It describes a fundamental constraint of distributed systems.

17 min read

Beginner · Core Concepts

Failover

Without failover, any single component failure can bring down your entire system. Failover is how you achieve high availability in practice; it is the…

6 min read

Intermediate · Core Concepts

Fault Tolerance

In large-scale systems, component failures are not exceptions — they are the norm.

7 min read

Beginner · Core Concepts

System Design Fundamentals

A comprehensive overview of what system design is, why it matters for every software engineer, and the foundational building blocks that every production…

13 min read

Networking

Databases

Intermediate · Databases

ACID Transactions

Understanding ACID is essential for choosing between SQL and NoSQL databases. Financial systems require ACID. Social media feeds may not.

6 min read

Intermediate · Databases

SQL vs NoSQL

Choosing the right database is one of the most impactful decisions in system design. The wrong choice leads to painful migrations.

7 min read

Intermediate · Databases

Database Indexes

Indexes are often the single most impactful performance optimization for databases. A query that takes 30 seconds without an index can take 1 millisecond with…

9 min read

Advanced · Databases

Database Sharding

When a single database server cannot handle the data volume or query load, sharding is the solution.

16 min read

Intermediate · Databases

Data Replication

Nearly every production database uses replication. Without it, a single server failure means data loss and downtime.

9 min read

Intermediate · Databases

Database Scaling

The database is almost always the first bottleneck in a growing system. Knowing the scaling playbook, and the order in which to apply techniques, is…

7 min read

Intermediate · Databases

Database Types

Choosing the right database for each component of your system is a core design skill.

6 min read

Advanced · Databases

Bloom Filters

Bloom filters save expensive disk/network lookups. Before querying a database or cache, check the Bloom filter.

10 min read

Advanced · Databases

Database Architectures

Understanding database architectures helps you design systems that meet availability, consistency, and performance requirements.

6 min read

Intermediate · Databases

NoSQL Data Modeling

How to model data in NoSQL databases using denormalization, access-pattern-driven design, and practical patterns for document, wide-column, and key-value…

14 min read

Caching

Async Communication

Distributed Systems

Intermediate · Distributed Systems

Heartbeats in Distributed Systems

Failure detection is the foundation of fault tolerance. Without heartbeats, you cannot know when a server has crashed, and failover cannot begin.

7 min read

Intermediate · Distributed Systems

Service Discovery

In microservices architectures with dynamic scaling (containers, Kubernetes), services come and go constantly.

6 min read

Advanced · Distributed Systems

Consensus Algorithms

Without consensus, distributed systems cannot reliably replicate data, elect leaders, or coordinate actions.

9 min read

Advanced · Distributed Systems

Distributed Locking

Without distributed locks, concurrent processes can cause data corruption, double-spending, overselling inventory, or duplicate processing.

9 min read

Advanced · Distributed Systems

Gossip Protocol

Gossip protocols enable decentralized failure detection, membership management, and data dissemination without a central coordinator.

6 min read

Intermediate · Distributed Systems

Circuit Breaker Pattern

Without circuit breakers, a failing downstream service can cascade failures throughout your system.

9 min read

Intermediate · Distributed Systems

Disaster Recovery

Disasters happen: AWS us-east-1 has had multi-hour outages, entire data centers have lost power, and ransomware attacks have encrypted production…

7 min read

Intermediate · Distributed Systems

Distributed Tracing

In a microservices system, a single user request may pass through 10+ services. When something is slow or fails, you need to see the entire chain to find…

6 min read

Advanced · Distributed Systems

Leader Election

How distributed systems elect a single leader to coordinate work, covering Raft, Bully, and Ring algorithms, along with real-world implementations in…

14 min read

Architecture Patterns
