System Design Interview Questions
Scaling is the process of expanding system capability to handle growing request volumes:
- Vertical Scaling (Scale Up): Adding raw compute resources (bigger CPU, more RAM, faster SSDs) to an existing single server.
- Pros: Simple to implement; no application changes or database replication required.
- Cons: Hard hardware ceiling; single point of failure; cost rises much faster than capacity at high-end specifications.
- Horizontal Scaling (Scale Out): Adding more server machines to the pool, distributing load across them.
- Pros: Capacity grows by adding machines, with no single-machine ceiling; redundancy improves availability.
- Cons: Requires a load balancer; application servers should be stateless, with session state kept in a shared store; spreading data across nodes introduces replication and consistency complexity.
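The stateless-tier requirement can be illustrated with a minimal sketch. It assumes an in-memory `SessionStore` standing in for an external store such as Redis; `SessionStore`, `AppServer`, and `RoundRobinBalancer` are hypothetical names used only for illustration.

```python
from itertools import cycle


class SessionStore:
    """Stand-in for a shared external store (assumption: Redis or similar)."""

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}

    def get(self, session_id: str) -> dict:
        # Return the session dict, creating an empty one on first access.
        return self._data.setdefault(session_id, {})


class AppServer:
    """Stateless server: all per-user data lives in the shared store."""

    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store

    def handle(self, session_id: str) -> str:
        session = self.store.get(session_id)
        session["hits"] = session.get("hits", 0) + 1
        return f"{self.name}: hits={session['hits']}"


class RoundRobinBalancer:
    """Sends each request to the next server in turn."""

    def __init__(self, servers: list[AppServer]) -> None:
        self._servers = cycle(servers)

    def route(self, session_id: str) -> str:
        return next(self._servers).handle(session_id)


store = SessionStore()
balancer = RoundRobinBalancer([AppServer(f"app-{i}", store) for i in range(3)])
responses = [balancer.route("user-42") for _ in range(4)]
# responses: ['app-0: hits=1', 'app-1: hits=2', 'app-2: hits=3', 'app-0: hits=4']
```

Because no server holds the counter itself, any server can handle any request and the count stays correct as requests move between machines.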
Key Points
Scale Up vs Scale Out, Hardware Ceiling, Stateless application tier, Cost vs Capacity
Common Follow-ups
What is statelessness in horizontal scaling, and where is session state stored?
Requirements:
- Generate a unique, short alias for a long URL.
- Redirect short URL to the original URL.
- High availability, low latency reads.
Core Components:
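1. Key Generation: Apply Base62 encoding (a-z, A-Z, 0-9) to a unique numeric ID (e.g., from a distributed ID generator like Snowflake). Base62 is an encoding rather than a hash, so distinct IDs always produce distinct keys. A 7-character Base62 key yields 62^7 ≈ 3.5 trillion unique URLs. Note that a full 64-bit ID such as a Snowflake ID encodes to as many as 11 Base62 characters, so 7-character keys require a smaller ID space (for example, a sequential counter).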
2. Database: Use a key-value store like Redis for caching popular mappings, with PostgreSQL/Cassandra as persistent storage. Schema: (id, short_key, long_url, created_at).
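3. API: POST /shorten with long_url → returns short_key. GET /{short_key} → HTTP 301 redirect to long_url. A 301 is permanent and may be cached by browsers, which reduces server load; a 302 is the usual alternative when every click must reach the server (e.g., for analytics).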
4. Scaling: Serve redirects for hot keys from a CDN or edge cache close to users. Pre-generate unique keys in batches so that a write never has to check for a collision.
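The Base62 step can be sketched as follows. This is a minimal illustration, assuming the alphabet order 0-9, a-z, A-Z (any fixed order of the 62 characters works); `encode_base62` and `decode_base62` are hypothetical helper names.

```python
import string

# Assumed alphabet order: digits, then lowercase, then uppercase (62 characters).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(number: int) -> str:
    """Convert a non-negative integer ID to its Base62 string."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_base62(key: str) -> int:
    """Convert a Base62 string back to the integer ID."""
    number = 0
    for ch in key:
        number = number * BASE + ALPHABET.index(ch)
    return number


# Capacity of a 7-character key: 62**7 == 3_521_614_606_208 (about 3.5 trillion).
largest_7_char_key = encode_base62(62**7 - 1)
```

Since the mapping is reversible, the service could in principle look up a record by decoding the key back to its numeric ID instead of indexing on the string.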
Read/Write Ratio: ~100:1 reads to writes, so optimize for fast lookups.
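A read-heavy workload like this is commonly served with a cache-aside lookup. The sketch below assumes plain dictionaries standing in for Redis and the persistent database; `UrlResolver` and its `db_reads` counter are hypothetical and exist only to make the cache behaviour visible.

```python
from typing import Optional


class UrlResolver:
    """Cache-aside lookup: check the cache first, fall back to the database."""

    def __init__(self, database: dict[str, str]) -> None:
        self.database = database          # stand-in for PostgreSQL/Cassandra
        self.cache: dict[str, str] = {}   # stand-in for Redis
        self.db_reads = 0                 # counts database round trips

    def resolve(self, short_key: str) -> Optional[str]:
        cached = self.cache.get(short_key)
        if cached is not None:
            return cached
        # Cache miss: read from the database and populate the cache on success.
        self.db_reads += 1
        long_url = self.database.get(short_key)
        if long_url is not None:
            self.cache[short_key] = long_url
        return long_url


resolver = UrlResolver({"abc1234": "https://example.com/some/long/path"})
first = resolver.resolve("abc1234")   # miss, reads the database
second = resolver.resolve("abc1234")  # hit, served from the cache
```

This sketch does not cache misses, so repeated lookups of unknown keys still reach the database; a real deployment might add negative caching or a TTL, which is left out here.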
Key Points
Base62 encoding, Write-once read-many, Cache-heavy (Redis), CDN for hot keys, 301 redirect
Common Follow-ups
How do you handle custom short URLs? How do you prevent one user from guessing another's URLs?
Requirements:
- Send and receive messages in real-time.
- Support one-on-one and group chats.
- Messages must be reliably delivered and ordered.
Core Components:
1. WebSocket Connection: Maintain a persistent TCP connection between client and server for real-time bidirectional communication.
2. Chat Service: Stateless service that routes messages. Each message is stored and assigned a monotonically increasing sequence ID (per conversation) for ordering.
3. Message Store: Use a distributed database like Cassandra (wide-column, fast writes) or a time-series DB. Schema: (conversation_id, message_id, sender_id, content, timestamp).
4. Presence Service: Redis pub/sub or a heartbeat mechanism to track online/offline status.
5. Group Chat Fan-out: For small groups (<100 members), fan out on write: copy each message to every member's inbox. For large groups, fan out on read (pull model): store the message once and let members fetch new messages when they log in or reconnect.
6. Delivery Semantics: At-least-once delivery: messages are retried until acknowledged, so duplicates can occur and the client deduplicates them by message ID.
7. End-to-End Encryption: Each message is encrypted on the sender's device and decrypted on the receiver's; the server only stores ciphertext.
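Per-conversation sequence IDs and client-side deduplication (components 2 and 6) can be sketched together. This is a single-process illustration under stated assumptions: `ChatService`, `Client`, and `Message` are hypothetical names, and a real service would need an atomic, persisted counter per conversation rather than an in-memory dictionary.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    conversation_id: str
    seq: int          # monotonically increasing within one conversation
    sender_id: str
    content: str


class ChatService:
    """Assigns the next sequence ID for each conversation (in-memory only)."""

    def __init__(self) -> None:
        self._last_seq: dict[str, int] = defaultdict(int)

    def accept(self, conversation_id: str, sender_id: str, content: str) -> Message:
        self._last_seq[conversation_id] += 1
        seq = self._last_seq[conversation_id]
        return Message(conversation_id, seq, sender_id, content)


class Client:
    """Drops duplicate deliveries and orders messages by sequence ID."""

    def __init__(self) -> None:
        self._seen: set[tuple[str, int]] = set()
        self._messages: list[Message] = []

    def receive(self, message: Message) -> bool:
        key = (message.conversation_id, message.seq)
        if key in self._seen:
            return False  # duplicate from an at-least-once retry
        self._seen.add(key)
        self._messages.append(message)
        return True

    def timeline(self, conversation_id: str) -> list[str]:
        own = [m for m in self._messages if m.conversation_id == conversation_id]
        return [m.content for m in sorted(own, key=lambda m: m.seq)]


service = ChatService()
m1 = service.accept("conv-1", "alice", "hi")
m2 = service.accept("conv-1", "bob", "hello")

client = Client()
# Out-of-order arrival plus one redelivery of m2.
results = [client.receive(m) for m in (m2, m1, m2)]
ordered = client.timeline("conv-1")
```

Here the duplicate delivery is rejected and the timeline is rebuilt in sequence order even though the messages arrived out of order.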
Key Points
WebSocket, Fan-out strategy, Inbox/Outbox pattern, Sequence IDs for ordering, E2E encryption
Common Follow-ups
How would you handle multi-device sync? How do you deliver offline messages when a user comes back online?