SDMastery
Medium | 7 min read | Updated 2026-06-03

Design TikTok

Design TikTok with a short-video feed, a recommendation algorithm, video upload/processing, and a global CDN.

Problem Statement

Design a short-video platform like TikTok with an infinite-scroll "For You" feed, video creation/upload with effects, a recommendation algorithm that learns from user behavior in real time, and global video delivery. The system must handle 1B+ MAU with engagement-optimized content ranking.

Requirements

Functional

  • Upload short videos (15s-3min) with filters, music overlay, and effects
  • "For You" feed: personalized infinite scroll of recommended videos from all creators
  • Following feed: videos from followed creators, reverse chronological
  • Interact: like, comment, share, duet/stitch (respond to another video)

Non-Functional

  • Latency: Next video preloaded and starts instantly (0ms perceived wait) during scroll
  • Scale: 1.5B MAU, 1B DAU, 1B video views/hour, 5M new videos uploaded/day
  • Recommendation quality: Users average 90+ minutes/day; cold start for new users is resolved within 10 videos
  • Global: Low-latency delivery to 150+ countries
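The scale targets above are easier to reason about as per-second rates. The following is a rough back-of-envelope sketch; the view and upload figures come from the requirements, while the average stored size per video is an assumption introduced only to illustrate the storage estimate.

```python
# Back-of-envelope rates derived from the stated scale targets.
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

views_per_hour = 1_000_000_000   # from the requirements
uploads_per_day = 5_000_000      # from the requirements

views_per_second = views_per_hour / SECONDS_PER_HOUR
uploads_per_second = uploads_per_day / SECONDS_PER_DAY

# ASSUMPTION (not from the requirements): average stored size per video,
# summed over all transcoded renditions, in megabytes.
ASSUMED_AVG_VIDEO_MB = 50
daily_storage_tb = uploads_per_day * ASSUMED_AVG_VIDEO_MB / 1_000_000

print(f"{views_per_second:,.0f} views/s")       # 277,778 views/s
print(f"{uploads_per_second:,.1f} uploads/s")   # 57.9 uploads/s
print(f"{daily_storage_tb:,.0f} TB/day")        # 250 TB/day
```

The read rate exceeds the write rate by several thousand times, which is why the rest of the design leans on CDN caching and precomputed feeds for reads and on an asynchronous pipeline for writes.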

Core Architecture

  1. Video Processing Pipeline -- The user uploads a video to blob storage. An async pipeline then (1) validates format/length, (2) applies server-side effects if needed, (3) transcodes to multiple renditions (360p/720p/1080p) in H.264 and H.265, (4) generates a thumbnail and preview, and (5) runs content moderation (nudity, violence, copyright via audio fingerprinting). Completed videos enter the recommendation candidate pool.

  2. For You Recommendation Engine -- Two phases: (a) Candidate retrieval selects ~5000 videos from collaborative filtering (similar user behavior), content features (hashtags, audio, visual embeddings), and the trending pool. (b) A ranking model predicts engagement probability using features such as video quality, creator authority, user interest vector, time-of-day, and diversity requirements. The engine re-ranks in real time as user engagement signals arrive.

  3. Video CDN with Prefetch -- Videos are distributed to CDN edge nodes globally. The client prefetches the next 3-5 videos in the feed while the user watches the current one. Popular videos are proactively pushed to edges; long-tail videos use pull-through caching. This ensures zero-wait transitions between videos.

  4. Real-time Engagement Pipeline -- Every interaction (watch time, like, skip, share, replay) is sent to Kafka within 100ms. A Flink streaming job updates the user's interest vector and feeds signals back to the recommendation engine. This creates a tight feedback loop: if a user watches 3 cooking videos in a row, the 4th recommendation shifts toward cooking content.
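The two-phase flow of the For You Recommendation Engine can be sketched in a few lines. This is a minimal illustration, not the production algorithm: the `Video` fields, the `score` function (interest weight times quality), and the per-topic cap used for diversity are all hypothetical stand-ins for the trained ranking model and its feature set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    video_id: str
    topic: str
    quality: float  # hypothetical 0..1 quality score


def retrieve_candidates(pools, limit):
    """Phase (a): merge candidate pools, de-duplicating by video_id."""
    seen = set()
    merged = []
    for pool in pools:
        for video in pool:
            if video.video_id not in seen:
                seen.add(video.video_id)
                merged.append(video)
    return merged[:limit]


def score(video, interest):
    """Hypothetical stand-in for the ranking model: interest weight x quality."""
    return interest.get(video.topic, 0.0) * video.quality


def rank(candidates, interest, top_k, max_per_topic):
    """Phase (b): sort by score, then cap videos per topic for diversity."""
    ordered = sorted(candidates, key=lambda v: score(v, interest), reverse=True)
    per_topic = {}
    feed = []
    for video in ordered:
        if per_topic.get(video.topic, 0) < max_per_topic:
            feed.append(video)
            per_topic[video.topic] = per_topic.get(video.topic, 0) + 1
        if len(feed) == top_k:
            break
    return feed


# Three illustrative pools: collaborative filtering, content features, trending.
collab = [Video("v1", "cooking", 0.9), Video("v2", "cooking", 0.8)]
content = [Video("v2", "cooking", 0.8), Video("v3", "dogs", 0.7)]
trending = [Video("v4", "cooking", 0.6), Video("v5", "music", 0.5)]
interest = {"cooking": 0.9, "dogs": 0.4, "music": 0.1}

candidates = retrieve_candidates([collab, content, trending], limit=5000)
feed = rank(candidates, interest, top_k=3, max_per_topic=2)
print([v.video_id for v in feed])  # ['v1', 'v2', 'v3']
```

Note how the third cooking video (`v4`) outscores the dog video but is skipped by the diversity cap; a real system would likely apply a softer penalty rather than a hard limit.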

Database Choice

  • Cassandra: user interaction log (likes, watch history); write-heavy, time-series, partitioned by user_id
  • PostgreSQL: video metadata, creator profiles, and follow relationships
  • Redis: precomputed feed cache (list of the next 100 video_ids per user) and real-time engagement counters
  • S3/GCS: video storage
  • Feature Store (Redis/DynamoDB): serving ML features at inference time with sub-5ms latency

Key API Endpoints

GET /api/v1/feed/foryou?cursor={token}&limit=10
  -> Returns: { videos: [{ video_id, cdn_url, creator, description, music, stats }], next_cursor }

POST /api/v1/videos/upload (resumable)
  -> Returns: { video_id: "V-789", status: "PROCESSING" }

POST /api/v1/interactions
  -> Body: { video_id: "V-789", type: "watch_time", value_ms: 14200 }
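The `cursor` and `next_cursor` fields on the feed endpoint imply opaque cursor-based pagination. One possible way to implement that is sketched below, assuming the cursor simply wraps an offset into the user's precomputed feed list. The token format (base64-encoded JSON) and the function names are illustrative assumptions, not part of the API above.

```python
import base64
import json
from typing import Optional


def encode_cursor(offset: int) -> str:
    """Wrap the offset in an opaque, URL-safe token."""
    raw = json.dumps({"offset": offset}).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(token: Optional[str]) -> int:
    """A missing cursor means 'start from the top of the feed'."""
    if not token:
        return 0
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    return int(json.loads(raw)["offset"])


def get_foryou_page(feed_ids, cursor: Optional[str], limit: int = 10) -> dict:
    """Return one page of video ids plus the cursor for the next page."""
    start = decode_cursor(cursor)
    page = feed_ids[start:start + limit]
    end = start + len(page)
    next_cursor = encode_cursor(end) if end < len(feed_ids) else None
    return {"videos": [{"video_id": vid} for vid in page], "next_cursor": next_cursor}


# Usage: walk a 25-item feed in pages of 10.
feed_ids = [f"V-{i}" for i in range(25)]
page1 = get_foryou_page(feed_ids, None)
page2 = get_foryou_page(feed_ids, page1["next_cursor"])
page3 = get_foryou_page(feed_ids, page2["next_cursor"])
```

Keeping the token opaque lets the server later change what it encodes (for example a feed version or a session id) without breaking clients.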

Scaling Insight

The real-time feedback loop is TikTok's competitive advantage. Traditional recommendation systems update user profiles in batch (hourly/daily). The Flink pipeline in this design updates the user's interest vector within seconds of each interaction, so the system adapts to user mood and intent within a single session: watch 2 videos about dogs, and the 3rd recommendation is already about dogs. This tight loop is why new users can get hooked within minutes despite having no history.
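To make the session-level adaptation concrete, here is a minimal sketch of one way an interest vector could be updated per interaction. It assumes an exponential moving average over topic weights and a watch-completion ratio as the signal; both are illustrative choices, since the text does not specify the actual update rule, and the streaming infrastructure (Kafka, Flink) is left out entirely.

```python
def watch_signal(watched_ms: int, duration_ms: int) -> float:
    """Map watch time to a 0..1 signal (completion ratio, capped for replays)."""
    if duration_ms <= 0:
        return 0.0
    return min(watched_ms / duration_ms, 1.0)


def update_interest(interest, topic, signal, alpha=0.3):
    """Exponential moving average: decay every topic, boost the observed one.

    alpha is a hypothetical learning rate; higher values adapt faster.
    """
    updated = {t: (1 - alpha) * w for t, w in interest.items()}
    updated[topic] = updated.get(topic, 0.0) + alpha * signal
    return updated


# A user with a prior interest in news fully watches two dog videos.
interest = {"news": 0.5}
for _ in range(2):
    interest = update_interest(interest, "dogs", watch_signal(15_000, 15_000))

top_topic = max(interest, key=interest.get)
print(top_topic)  # dogs
```

With these assumed parameters, two fully watched dog videos are enough for "dogs" to overtake the prior interest, mirroring the behavior described above.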

Key Tradeoffs

| Decision | Option A | Option B | Chosen |
|---|---|---|---|
| Feed generation | Pre-compute full feed | Generate on-demand per scroll | Hybrid -- pre-compute next 100, regenerate when 70% consumed |
| Recommendation latency | Batch daily updates | Real-time per interaction | Real-time -- session-level personalization is the core product differentiator |
| Video codec | H.264 only (universal) | H.265 + H.264 | Both -- H.265 saves 40% bandwidth on supported devices, H.264 as fallback |
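The hybrid feed-generation choice (pre-compute the next 100, regenerate when 70% consumed) can be sketched as a small cache wrapper. This is an in-memory illustration under stated assumptions: `FeedCache` and the `generate` callback are hypothetical names, and regeneration here replaces the unconsumed tail with a fresh batch so that the newest engagement signals take effect.

```python
from collections import deque


class FeedCache:
    """Serves video ids from a precomputed batch and regenerates it early."""

    def __init__(self, generate, batch_size=100, refill_percent=70):
        self._generate = generate          # hypothetical callback: n -> list of ids
        self._batch_size = batch_size
        # Integer threshold avoids floating-point comparison issues.
        self._refill_at = batch_size * refill_percent // 100
        self._queue = deque(generate(batch_size))
        self._served = 0
        self.refills = 0

    def next_video(self):
        video_id = self._queue.popleft()
        self._served += 1
        if self._served >= self._refill_at:
            # ASSUMPTION: drop the stale tail and rank a fresh batch.
            self._queue = deque(self._generate(self._batch_size))
            self._served = 0
            self.refills += 1
        return video_id


# Usage with a stub generator that labels each batch.
calls = []

def stub_generate(n):
    calls.append(n)
    base = len(calls) * 1000
    return [f"V-{base + i}" for i in range(n)]

cache = FeedCache(stub_generate)
served = [cache.next_video() for _ in range(70)]
after_refill = cache.next_video()
```

A production version would regenerate asynchronously so the scroll never blocks on ranking, and would avoid discarding videos the client has already prefetched.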

Practical Implementation for .NET Developers

In a .NET application, you would typically implement this pattern using the following approach:

ASP.NET Core setup: Create a service class that encapsulates the logic, register it with dependency injection, and inject it into your controllers or minimal API endpoints. The built-in DI container handles lifecycle management.

Entity Framework Core: For database interactions, EF Core provides the ORM layer. Use migrations for schema management and raw SQL for performance-critical queries. Consider Dapper for read-heavy paths where EF Core's overhead matters.

Azure integration: If deploying to Azure, leverage managed services — Azure Cache for Redis, Azure SQL, Azure Service Bus, Azure Cosmos DB. These eliminate operational overhead and provide built-in monitoring through Application Insights.

Testing: Use xUnit with Testcontainers for integration tests that spin up real databases in Docker. Mock external dependencies with NSubstitute. The WebApplicationFactory class lets you test your entire HTTP pipeline in-process.

Monitoring: Add Application Insights telemetry to track request latency, dependency calls, and custom metrics. Use structured logging with Serilog to make production debugging possible:

Log.Information("Processing order {OrderId} for {CustomerId}", orderId, customerId);

This gives you searchable, structured logs in Azure Monitor or Seq.

System-Specific Clarifying Questions

Before designing TikTok, ask questions specific to THIS system:

  1. Who are the primary users? Understanding the user base shapes every technical decision — consumer apps have different requirements than enterprise B2B systems.
  2. What is the read-to-write ratio? This determines whether you optimize for fast reads (caching, denormalization) or fast writes (write-ahead logs, async processing).
  3. What is the geographic distribution? Users in one country vs. global users fundamentally changes your data replication and CDN strategy.
  4. What is the acceptable latency? Some features need sub-100ms responses, others can tolerate seconds. This determines your caching and architecture strategy.
  5. What is the consistency requirement? Some data (payments, inventory) needs strong consistency. Other data (social feeds, recommendations) can be eventually consistent.

Architecture Deep Dive

The architecture for TikTok should be designed around the specific access patterns of the system. Do not apply generic templates: every system has unique hotspots, bottlenecks, and scaling challenges.

Write Path: How does data enter the system? Is it bursty (event-driven, flash sales) or steady (sensor data, logs)? Bursty writes need queuing and backpressure. Steady writes can go directly to the database.
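As a small illustration of queuing with backpressure on a bursty write path, the sketch below uses a bounded in-memory queue that rejects new events when full, so the caller can back off and retry. `IngestBuffer` is a hypothetical name, and a real deployment would more likely rely on a durable broker than on a process-local queue.

```python
import queue


class IngestBuffer:
    """Bounded buffer: accepts events until full, then signals backpressure."""

    def __init__(self, capacity: int):
        self._queue = queue.Queue(maxsize=capacity)

    def offer(self, event: dict) -> bool:
        """Return True if accepted, False if the caller should back off."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            return False

    def drain(self, max_items: int) -> list:
        """Remove up to max_items events for a downstream batch write."""
        items = []
        while len(items) < max_items:
            try:
                items.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return items


# Usage: a burst of three events against a buffer of capacity two.
buffer = IngestBuffer(capacity=2)
results = [buffer.offer({"video_id": "V-789", "seq": i}) for i in range(3)]
batch = buffer.drain(max_items=10)
accepted_after_drain = buffer.offer({"video_id": "V-789", "seq": 3})
```

Rejecting at the edge (for example with an HTTP 429 response) keeps the burst from overwhelming the database, at the cost of requiring clients to retry.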

Read Path: How is data consumed? Is it fan-out (one write, many reads like social feeds) or point lookups (one read for specific data like user profiles)? Fan-out reads benefit from pre-computation and caching. Point lookups benefit from efficient indexing.

Hot Spots: Where are the bottlenecks? For TikTok, identify the component that will fail first under load and design mitigation strategies: caching, sharding, rate limiting, or async processing.
