Module 4: Network Primitives & Traffic Management
PHASE 2 — SYSTEM DESIGN FOUNDATIONS: This module teaches you the network layer that your distributed services depend on. Every request in a microservice system crosses these primitives before reaching your application code. Understanding TCP handshake overhead, load balancing algorithms, and TLS termination is essential before designing event-driven systems (Module 9) and fault-tolerant services (Module 14).
Introduction: The Network Is Not Reliable
Peter Deutsch's Eight Fallacies of Distributed Computing (Module 7) begins with "The network is reliable." Engineers who believe this deploy systems that collapse under real-world conditions. Before you design your first distributed service, you must understand how data actually travels across a network.
When a user clicks "Place Order" on your frontend, the following occurs before a single line of your application code runs:
[Browser]
| DNS resolution: macropatternsconsortium.com → 104.21.33.82 (~20ms on first lookup)
v
[Anycast DNS Network]
| TCP 3-way handshake with load balancer (~15ms RTT)
v
[TLS 1.3 Handshake] (~25ms for 1-RTT, 0ms for 0-RTT session resumption)
v
[Load Balancer: L7 HTTP routing]
| Header inspection → routes to /api/* backend pool
v
[Application Server]
| Application code begins executing
Total network overhead before your code runs: ~60ms minimum.
Section 1: The OSI Model — Which Layers Matter for Architects
The OSI (Open Systems Interconnection) model is a 7-layer framework describing how data moves through a network. As an architect, you will mostly work with five of those layers:
Layer 7: Application (HTTP, gRPC, WebSocket, DNS)
Layer 4: Transport (TCP, UDP)
Layer 3: Network (IP, ICMP, BGP routing)
Layer 2: Data Link (Ethernet, MAC addresses)
Layer 1: Physical (Cable, fiber, radio)
Layer 4 (Transport) is where you configure connection behavior, timeouts, and load balancing strategies. Layer 7 (Application) is where HTTP headers, URL paths, and cookies are inspected for intelligent routing decisions.
Section 2: TCP vs. UDP — The Reliability Trade-Off
A. TCP (Transmission Control Protocol)
TCP is a connection-oriented, reliable protocol. Before any data is transmitted, the client and server perform a 3-way handshake:
Client Server
|--- SYN (seq=1000) ----------->| "I want to connect"
|<-- SYN-ACK (seq=2000, ack=1001) | "OK, I'm ready"
|--- ACK (ack=2001) ----------->| "Connection established"
| |
|=== DATA TRANSFER ==============|
| |
|--- FIN ---------------------->| "Done, closing"
|<-- FIN-ACK -------------------|
TCP Handshake Cost: On a cross-region link with 100ms one-way latency (a 200ms round trip), a new TCP connection costs 200ms before the client can send a single byte of application data. This is why connection pooling (reusing established TCP connections) is essential for database and HTTP performance.
TCP Flow Control and Congestion Control
TCP uses a sliding window mechanism to control how much data can be in flight:
$$\text{Throughput} = \frac{\text{CWND}}{RTT}$$
where $\text{CWND}$ is the congestion window size. On a new connection, TCP starts with a small window (slow start) and doubles it each RTT until it detects packet loss or reaches the slow-start threshold. This means a TCP connection cannot reach its maximum throughput instantly; it requires several RTTs to "warm up."
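The warm-up cost can be estimated with a small sketch. It models an idealized slow start only (the window doubles every RTT, no loss, no threshold), and the initial window of 10 segments and the 1460-byte segment size are assumptions chosen for illustration, not values taken from this module.

```python
def rtts_to_reach_window(target_segments: int, initial_cwnd: int = 10) -> int:
    """Count the RTTs an idealized slow start needs to grow cwnd to the target.

    Assumes cwnd doubles once per RTT and no packet is lost.
    """
    if target_segments <= 0 or initial_cwnd <= 0:
        raise ValueError("window sizes must be positive")
    cwnd = initial_cwnd
    rtts = 0
    while cwnd < target_segments:
        cwnd *= 2
        rtts += 1
    return rtts


def throughput_bytes_per_s(cwnd_segments: int, rtt_s: float, mss_bytes: int = 1460) -> float:
    """Apply Throughput = CWND / RTT, with CWND converted from segments to bytes."""
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    return cwnd_segments * mss_bytes / rtt_s


if __name__ == "__main__":
    # Growing from 10 to 640 segments takes six doublings, so six RTTs.
    print(rtts_to_reach_window(640))
    print(throughput_bytes_per_s(cwnd_segments=10, rtt_s=0.1))
```

On a long-RTT link, each of those doublings costs a full round trip, which is why short-lived connections rarely reach the throughput the link could sustain.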
Implications for Architects:
- Never close and re-open database connections for each request. Use connection pools (PgBouncer, HikariCP).
- HTTP/1.1 keep-alive reuses TCP connections across multiple HTTP requests.
- HTTP/2 multiplexes multiple requests over a single TCP connection.
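A minimal sketch of what a connection pool does is shown below. The class name `ConnectionPool`, its `connect` factory parameter, and the `opened` counter are hypothetical names introduced for illustration; real pools such as PgBouncer or HikariCP add health checks, limits on total connections, and timeouts that this sketch leaves out.

```python
import queue
from contextlib import contextmanager
from typing import Any, Callable, Iterator


class ConnectionPool:
    """Hand out idle connections when available; open a new one only when none is idle."""

    def __init__(self, connect: Callable[[], Any], max_idle: int) -> None:
        self._connect = connect                      # factory that pays the handshake cost
        self._idle: "queue.LifoQueue[Any]" = queue.LifoQueue(maxsize=max_idle)
        self.opened = 0                              # how many handshakes were actually paid

    @contextmanager
    def connection(self) -> Iterator[Any]:
        try:
            conn = self._idle.get_nowait()           # reuse: no new handshake
        except queue.Empty:
            conn = self._connect()                   # nothing idle: open a new connection
            self.opened += 1
        try:
            yield conn
        finally:
            try:
                self._idle.put_nowait(conn)          # return it for the next caller
            except queue.Full:
                pass                                 # pool is full; a real pool would close conn here


if __name__ == "__main__":
    pool = ConnectionPool(connect=object, max_idle=4)
    for _ in range(100):
        with pool.connection():
            pass
    print(pool.opened)  # sequential requests share a single connection
```

With sequential requests the factory runs once, so the handshake cost is paid once instead of once per request.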
B. UDP (User Datagram Protocol)
UDP is connectionless and unreliable. It sends data without a handshake and provides no guarantee of delivery, ordering, or de-duplication.
Client Server
|--- DATA (no handshake) ------>| "Fire and forget"
Use cases for UDP:
- DNS: A single query-response fits in one datagram; retransmission overhead would be wasteful.
- Video Streaming / WebRTC: Occasional packet loss is preferable to retransmission stalls causing video freezing.
- Online Games: Real-time positional updates require low latency; stale position data is useless.
- QUIC Protocol (HTTP/3): QUIC, originally developed at Google and later standardized by the IETF, is UDP-based but implements its own reliability, ordering, and encryption layers.
| Feature | TCP | UDP |
|---|---|---|
| Connection | 3-way handshake required | No handshake |
| Reliability | Guaranteed delivery, ordered | No guarantee |
| Speed | Higher latency (handshake + ACKs) | Lower latency |
| Use Case | HTTP, databases, file transfer | DNS, video, gaming, QUIC |
| Head-of-line blocking | Yes (single stream) | No |
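The "no handshake" row can be seen directly with the standard `socket` module. The sketch below sends one datagram over the loopback interface; the function name `udp_roundtrip` is hypothetical, and on loopback the datagram will normally arrive, which is not something UDP guarantees on a real network.

```python
import socket


def udp_roundtrip(payload: bytes, timeout_s: float = 1.0) -> bytes:
    """Send one datagram to a local receiver and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
        receiver.settimeout(timeout_s)         # do not block forever if the datagram is lost
        # No connect() and no handshake: the first packet on the wire is the data itself.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_addr = receiver.recvfrom(2048)
        return data


if __name__ == "__main__":
    print(udp_roundtrip(b"position update"))
```

If the datagram were dropped, `recvfrom` would raise `socket.timeout` and nothing would retransmit it; any retry logic is the application's responsibility.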
Section 3: HTTP Evolution — HTTP/1.1, HTTP/2, HTTP/3
A. HTTP/1.1
HTTP/1.1 introduced keep-alive (persistent connections), but each request/response pair must complete before the next begins on a given connection (Head-of-Line Blocking):
Connection 1: [GET /api/user ---------> RESPONSE] [GET /api/orders ------> RESPONSE]
Connection 2: [GET /static/app.js ----> RESPONSE]
Connection 3: [GET /static/style.css -> RESPONSE]
Browsers typically open up to 6 parallel TCP connections per domain to work around this limitation, a hack that wastes resources.
B. HTTP/2
HTTP/2 solves head-of-line blocking at the HTTP layer using stream multiplexing — multiple requests fly concurrently over a single TCP connection:
Single TCP Connection:
Stream 1: |=== GET /api/user =====>|======= RESPONSE ==|
Stream 2: |======= GET /api/orders ======>|=== RESPONSE |
Stream 3: |== GET /static/app.js =================>|== RESPONSE
HTTP/2 Key Features:
- Header Compression (HPACK): Header fields are indexed against a static table and a per-connection dynamic table, and literal strings are Huffman-coded. A typical 500-byte HTTP/1.1 header block can shrink to roughly 50 bytes once the dynamic table is warm.
- Server Push: The server can proactively send resources (e.g., CSS and JS files) before the client requests them.
- Binary Framing: Requests are binary-encoded frames, not plain-text strings — faster to parse.
The remaining problem: TCP-level head-of-line blocking. A single lost TCP packet causes all HTTP/2 streams to stall until the packet is retransmitted.
C. HTTP/3 (QUIC)
HTTP/3 replaces TCP with QUIC, a UDP-based transport protocol originally developed by Google and later standardized by the IETF:
HTTP/1.1: Text headers → TCP → IP
HTTP/2: Binary frames → TCP → IP (HOL blocking at TCP level)
HTTP/3: Binary frames → QUIC (UDP-based) → IP (No HOL blocking)
QUIC Key Innovations:
- 0-RTT Connection Establishment: For returning users, QUIC completes TLS + connection in 0 additional RTTs (the handshake parameters are cached from the prior session).
- Independent Streams: Packet loss on Stream 1 does not block Stream 2 from delivering data.
- Connection Migration: If a mobile user switches from WiFi to cellular, the QUIC connection migrates via a connection ID without re-establishing the session.
Section 4: DNS Mechanics — Routing at the Network Edge
A. DNS Resolution Flow
[Browser] ---> "Where is macropatternsconsortium.com?"
|
[OS Cache] (TTL 60s)
|
[Recursive Resolver] (ISP/8.8.8.8)
|
[Root Name Server]
| "Ask .com TLD"
[.com TLD Server]
| "Ask Cloudflare NS"
[Cloudflare Authoritative NS]
|
104.21.33.82 (A record)
TTL (Time-To-Live): DNS records carry a TTL value (in seconds). Lower TTL = faster propagation of IP changes (useful during failovers) but higher DNS query load. Higher TTL = less DNS traffic but slower failover.
| TTL Setting | Propagation Speed | DNS Query Load | Use Case |
|---|---|---|---|
| 60 seconds | Very fast (<2 min) | Very high | Active failover, blue/green deploys |
| 300 seconds | Fast (<10 min) | High | Standard API services |
| 3600 seconds | Slow (~1 hour) | Low | Static CDN assets |
| 86400 seconds | Very slow (24h) | Very low | Long-lived records |
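The effect of TTL on a resolver or OS cache can be sketched as a small expiring map. The class name `TtlCache`, the injectable `clock` parameter, and the example hostname and address are hypothetical and used only for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Cache name -> address answers and forget each one when its TTL runs out."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, name: str, address: str, ttl_s: float) -> None:
        # Store the answer together with the moment it stops being valid.
        self._entries[name] = (address, self._clock() + ttl_s)

    def get(self, name: str) -> Optional[str]:
        entry = self._entries.get(name)
        if entry is None:
            return None                      # never cached: caller must query upstream
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]          # expired: caller must query upstream again
            return None
        return address


if __name__ == "__main__":
    now = [0.0]
    cache = TtlCache(clock=lambda: now[0])   # fake clock so the example is deterministic
    cache.put("service.example", "192.0.2.10", ttl_s=60)
    now[0] = 59.0
    print(cache.get("service.example"))      # still cached
    now[0] = 60.0
    print(cache.get("service.example"))      # expired
```

Until the entry expires, the cache keeps returning the old address even if the authoritative record has changed, which is the propagation delay the table describes.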
B. Anycast DNS
Anycast assigns the same IP address to multiple servers in different geographic locations. BGP routing directs each user's DNS query to the nearest server:
User in Tokyo: DNS query → routes to Asia PoP (10ms RTT)
User in London: DNS query → routes to EU PoP (8ms RTT)
User in Chicago: DNS query → routes to US PoP (5ms RTT)
All three PoPs share the IP address 1.1.1.1
Cloudflare's 1.1.1.1 DNS resolver and their CDN network use Anycast. 200+ PoPs worldwide share the same IP.
C. GeoDNS Routing
GeoDNS returns different A records based on the geographic origin of the DNS query. Used to route users to the nearest data center:
User in Tokyo: Resolves macropatternsconsortium.com → 35.187.200.X (Asia-Pacific)
User in London: Resolves macropatternsconsortium.com → 34.118.100.X (Europe)
User in Dallas: Resolves macropatternsconsortium.com → 104.21.33.X (US-East)
Trade-off: GeoDNS routing is not instantaneous. If the Asia-Pacific region fails and you update the DNS record to point Tokyo users to US-East, users with a 3600s TTL cached record will continue hitting the failed region for up to 1 hour.
Section 5: Load Balancing Algorithms
A load balancer distributes incoming requests across a pool of backend servers. The algorithm determines which server receives each request.
Algorithm Comparison
1. Round Robin: Requests are distributed sequentially across servers, cycling through the pool:
Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A (cycle restarts)
Problem: Servers with varying capacities get the same number of requests. A 4-core server gets as many requests as a 32-core server.
2. Weighted Round Robin: Servers are assigned weights reflecting their capacity:
Server A: weight=1 (4 cores) → 1 request per cycle
Server B: weight=4 (16 cores) → 4 requests per cycle
Server C: weight=8 (32 cores) → 8 requests per cycle
Formula: Server $i$ receives $\frac{w_i}{\sum_{j} w_j}$ fraction of total traffic.
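The weighted scheme, and the traffic fraction it produces, can be sketched as follows. This is a naive version that repeats each server `weight` times per cycle; the function names are hypothetical, and production balancers usually interleave the servers more smoothly rather than sending requests in bursts.

```python
from collections import Counter
from itertools import cycle, islice
from typing import Dict, Iterator


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Yield server names forever; each appears `weight` times per cycle."""
    schedule = [name for name, weight in weights.items() for _ in range(weight)]
    if not schedule:
        raise ValueError("at least one server needs a positive weight")
    return cycle(schedule)


def traffic_share(weights: Dict[str, int], requests: int) -> Dict[str, float]:
    """Measure the fraction of `requests` that each server receives."""
    counts = Counter(islice(weighted_round_robin(weights), requests))
    return {name: counts[name] / requests for name in weights}


if __name__ == "__main__":
    weights = {"A": 1, "B": 4, "C": 8}
    # Over whole cycles the measured share matches w_i / sum(w_j).
    print(traffic_share(weights, requests=13 * 100))
```

Setting every weight to 1 reduces this to plain round robin.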
3. Least Connections: Each request is routed to the server with the fewest active connections:
Server A: 45 active connections
Server B: 12 active connections ← New request goes here
Server C: 38 active connections
Best for: Workloads with highly variable request durations (e.g., API endpoints that range from 10ms to 10s).
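A minimal sketch of the bookkeeping behind least connections is shown below. The class name `LeastConnectionsBalancer` and its `acquire`/`release` methods are hypothetical; the sketch ignores concurrency and server weights.

```python
from typing import Dict, Iterable


class LeastConnectionsBalancer:
    """Track active connections per server and pick the least loaded one."""

    def __init__(self, servers: Iterable[str]) -> None:
        self.active: Dict[str, int] = {server: 0 for server in servers}
        if not self.active:
            raise ValueError("at least one server is required")

    def acquire(self) -> str:
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.__getitem__)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        if self.active[server] == 0:
            raise ValueError(f"{server} has no active connections to release")
        self.active[server] -= 1


if __name__ == "__main__":
    lb = LeastConnectionsBalancer(["A", "B", "C"])
    lb.active.update({"A": 45, "B": 12, "C": 38})   # the snapshot from the example above
    print(lb.acquire())                             # B has the fewest active connections
```

Because the counter only drops when a request finishes, a server stuck on slow requests naturally stops receiving new ones.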
4. IP Hash: The client IP address is hashed to determine the server:
$$\text{Server} = \text{hash}(\text{client\_IP}) \bmod N$$
This ensures the same client always reaches the same server (session affinity / sticky sessions). Problem: If a server crashes, all clients pinned to it lose their session, and if the pool is then rehashed with a smaller $N$, many of the remaining clients are remapped as well.
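The sketch below implements the modulo mapping and measures how many clients change servers when the pool shrinks from three servers to two. The choice of SHA-256 as the hash, the function names, and the synthetic client addresses are assumptions made for illustration.

```python
import hashlib
from typing import List, Sequence


def ip_hash(client_ip: str, servers: Sequence[str]) -> str:
    """Map a client IP to a server with hash(client_IP) mod N."""
    # A stable hash is used because Python's built-in hash() is randomized per process.
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def remapped_fraction(client_ips: Sequence[str], before: Sequence[str], after: Sequence[str]) -> float:
    """Fraction of clients whose server changes when the pool changes."""
    moved = sum(1 for ip in client_ips if ip_hash(ip, before) != ip_hash(ip, after))
    return moved / len(client_ips)


if __name__ == "__main__":
    clients: List[str] = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]
    # Server C is removed; with a uniform hash roughly two thirds of clients move.
    print(remapped_fraction(clients, ["A", "B", "C"], ["A", "B"]))
```

The remapped clients include many that were never pinned to the removed server, since changing $N$ changes the result of the modulo for most hash values.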
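5. Least Response Time: Routes to the server with the lowest combination of active connections and average response time:
$$\text{Score}(s) = \text{ActiveConnections}(s) \times \text{AvgResponseTime}(s)$$
Select the server with the minimum score. Both more active connections and slower responses raise the score, so busy or slow servers are avoided. Variants of this method appear in L7 proxies, for example the least_time method in NGINX Plus.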
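A scoring function for this method can be sketched as follows. It assumes the score is the product of active connections and average response time, so that a higher value on either factor makes a server less attractive; the function name and the sample numbers are hypothetical.

```python
from typing import Dict, Tuple


def pick_least_response_time(stats: Dict[str, Tuple[int, float]]) -> str:
    """Pick the server with the lowest (active connections x average response time).

    `stats` maps server name -> (active_connections, avg_response_time_ms).
    """
    if not stats:
        raise ValueError("at least one server is required")

    def score(name: str) -> float:
        active_connections, avg_response_ms = stats[name]
        return active_connections * avg_response_ms

    return min(stats, key=score)


if __name__ == "__main__":
    stats = {
        "A": (10, 20.0),   # score 200
        "B": (4, 80.0),    # score 320: few connections, but slow
        "C": (6, 30.0),    # score 180
    }
    print(pick_least_response_time(stats))  # C
```

Server B has the fewest connections, yet it is not chosen because its slow responses dominate the score.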
Section 6: Layer 4 vs. Layer 7 Load Balancing
Layer 4 (Transport Layer)
L4 load balancers forward TCP/UDP packets without inspecting application content:
[Client TCP Packet] → L4 LB → forwards to Backend (no content inspection)
- Pros: Extremely fast (no packet inspection overhead), supports any TCP/UDP protocol.
- Cons: Cannot route based on URL paths, HTTP headers, or cookies.
- Examples: AWS NLB, HAProxy (L4 mode), Linux LVS.
Layer 7 (Application Layer)
L7 load balancers terminate the TCP connection, inspect HTTP content, and make routing decisions:
[Client HTTP Request]
|
[L7 Load Balancer]
|
Inspect: URL path, Host header, Cookie, Authorization header
|
/api/* ──────────────> [API Backend Pool]
/static/* ───────────> [CDN / Static Asset Server]
/admin/* + auth ─────> [Admin Backend Pool]
- Pros: Path-based routing, header manipulation, SSL termination, circuit breaking, rate limiting.
- Cons: Higher latency per connection (inspection overhead), more complex, terminates TLS so must manage certificates.
- Examples: NGINX, Envoy, AWS ALB, Traefik.
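The routing decision in the diagram above can be sketched as a prefix match over the request path. The pool names, the `route` function, and the choice to return `None` for rejected or unmatched requests are assumptions for illustration; a real proxy would also inspect the Host header and cookies.

```python
from typing import List, Optional, Tuple

# (path prefix, backend pool, requires authentication); pool names are illustrative.
ROUTES: List[Tuple[str, str, bool]] = [
    ("/api/", "api-backend-pool", False),
    ("/static/", "static-asset-server", False),
    ("/admin/", "admin-backend-pool", True),
]


def route(path: str, authenticated: bool = False) -> Optional[str]:
    """Return the backend pool for a request path, or None if it cannot be routed."""
    for prefix, pool, needs_auth in ROUTES:
        if path.startswith(prefix):
            if needs_auth and not authenticated:
                return None          # matched, but the request lacks valid credentials
            return pool
    return None                      # no rule matched


if __name__ == "__main__":
    print(route("/api/orders"))
    print(route("/admin/users"))
    print(route("/admin/users", authenticated=True))
```

An L4 load balancer cannot make this decision at all, because it never sees the path or the headers.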
Section 7: SSL/TLS Termination
TLS Termination at the Load Balancer
[Client] ←— HTTPS (TLS 1.3) —→ [Load Balancer] ←— HTTP (Plaintext) —→ [Backend Servers]
The load balancer holds the TLS certificate and private key. It decrypts inbound traffic and forwards plaintext HTTP to backends over the internal VPC network. Backends do not manage certificates.
Advantages:
- Offloads CPU-intensive asymmetric cryptography (RSA/ECDSA signatures and ECDHE key exchange) from application servers.
- Centralized certificate management (one certificate to renew, not hundreds).
- Enables L7 inspection (headers are now readable).
NGINX TLS Configuration (Hardened Production)
upstream backend_servers {
    least_conn;                            # Weighted least-connections balancing
    server app-1.internal:3000 weight=4;
    server app-2.internal:3000 weight=4;
    server app-3.internal:3000 weight=2;
    keepalive 32;                          # Keep up to 32 idle upstream connections per worker
}

server {
    # On nginx 1.25.1+, prefer "listen 443 ssl;" plus a separate "http2 on;" directive
    listen 443 ssl http2;
    server_name macropatternsconsortium.com;

    # Certificate management (Let's Encrypt / Certbot)
    ssl_certificate     /etc/letsencrypt/live/macropatternsconsortium.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/macropatternsconsortium.com/privkey.pem;

    # TLS hardening: only TLS 1.2+ with strong ciphers
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers on;

    # HSTS: force HTTPS for 2 years
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";

    # Session resumption: reduce TLS handshake overhead for returning clients
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 1d;

    # Proxy to upstream
    location /api/ {
        proxy_pass http://backend_servers;
        proxy_http_version 1.1;
        proxy_set_header Connection "";    # Enable HTTP/1.1 keepalive upstream
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts
        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;
        proxy_send_timeout 30s;
    }

    location /static/ {
        # With "root", /static/app.js is read from /var/www/static/static/app.js.
        # Use "alias /var/www/static/;" instead if files sit directly in that directory.
        root /var/www/static;
        expires 1y;
        add_header Cache-Control "public, immutable";
        # add_header is not inherited once a location defines its own, so HSTS is repeated here
        add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";
    }
}

# Redirect HTTP to HTTPS
server {
    listen 80;
    server_name macropatternsconsortium.com;
    return 301 https://$host$request_uri;
}
Section 8: Connection Overhead Calculations
TCP Handshake Latency Formula
For a round-trip time of $RTT$:
$$\text{Total Latency (TCP + TLS + HTTP)} = \text{DNS} + 1.5 \times RTT_{\text{TCP}} + RTT_{\text{TLS}} + RTT_{\text{HTTP}}$$
For a US-to-Europe request ($RTT = 80ms$):
| Protocol | Additional RTTs | Additional Latency |
|---|---|---|
| TCP Handshake | 1.5 × RTT | 120ms |
| TLS 1.2 | 2 × RTT | 160ms |
| TLS 1.3 | 1 × RTT | 80ms |
| HTTP Request | 1 × RTT | 80ms |
| Total (TLS 1.3) | 3.5 × RTT | ~280ms + DNS |
For returning users with TLS 1.3 0-RTT Session Resumption: $$\text{TLS 0-RTT Latency} = 0 \times RTT = 0\text{ms additional}$$
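The arithmetic in this section can be reproduced with a short helper. It follows the accounting used above (1.5 RTT for the TCP handshake, 2 or 1 RTT for TLS, 1 RTT for the HTTP request); the function name and parameters are hypothetical.

```python
def connection_latency_ms(
    rtt_ms: float,
    tls_version: str = "1.3",
    resumed: bool = False,
    dns_ms: float = 0.0,
) -> float:
    """Latency until the first HTTP response arrives on a brand-new connection."""
    tls_rtts = {"1.2": 2.0, "1.3": 1.0}[tls_version]
    if resumed and tls_version == "1.3":
        tls_rtts = 0.0                     # 0-RTT session resumption adds no round trip
    tcp_rtts = 1.5                         # SYN, SYN-ACK, ACK
    http_rtts = 1.0                        # request out, response back
    return dns_ms + (tcp_rtts + tls_rtts + http_rtts) * rtt_ms


if __name__ == "__main__":
    print(connection_latency_ms(80))                      # 280.0, TLS 1.3
    print(connection_latency_ms(80, tls_version="1.2"))   # 360.0
    print(connection_latency_ms(80, resumed=True))        # 200.0
```

Comparing the three results shows that the TLS version and session resumption change the total by a full round trip each, which matters most on long cross-region links.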
Section 9: Hands-On Practice Challenge
The Challenge
The current topology shows a single client communicating directly with a single web server over plain HTTP. This architecture has zero redundancy and no TLS protection.
graph TD
Client[Client Browser] -->|HTTP| WebServer[Single Web Server]
Your Goal
Redesign this diagram to show a high-availability, TLS-terminated, load-balanced architecture:
- Introduce a LoadBalancer node between the client and the backend pool.
- Label the Client → LoadBalancer connection as HTTPS (TLS Termination).
- Scale the backend to two redundant nodes: WebServer1 and WebServer2.
- Label backend connections as HTTP (Internal).
- Show backend servers connecting to a shared Database tier.
Solution Model
graph TD
Client[Client Browser]
LB[NGINX Load Balancer\nSSL Termination]
WS1[Web Server 1\napp-1.internal]
WS2[Web Server 2\napp-2.internal]
DB[(PostgreSQL\nDatabase)]
Client -->|HTTPS - TLS 1.3| LB
LB -->|HTTP - Internal| WS1
LB -->|HTTP - Internal| WS2
WS1 -->|SQL| DB
WS2 -->|SQL| DB
Key architectural decisions in this diagram:
- TLS terminates at the load balancer — backends are never exposed to the internet.
- Two backend servers provide redundancy — if WS1 fails, the load balancer routes all traffic to WS2.
- Both servers share the same database, requiring stateless application design (no in-memory session state).
Bridge to Module 5: Now that traffic reaches your servers reliably, you need to store and retrieve data efficiently. Module 5 (Storage Paradigms & Database Mechanics) covers the internals of how databases store, index, and replicate your data — and when to choose SQL vs. NoSQL, single-leader vs. leaderless replication, and row-level vs. hash-based sharding.