Content Delivery Networks (CDNs) in Computer Networks - Time & Space Complexity
When studying CDNs, it is important to understand how the time to serve content changes based on the number of users, edge servers, and cached items.
We want to know how cache lookup and content retrieval scale as CDN infrastructure grows.
Analyze the time complexity of the following simplified CDN cache lookup process.
```python
def serve_requests(requests, edge_cache):
    responses = []
    for request in requests:
        cache_key = hash(request.url)
        if cache_key in edge_cache:
            responses.append(edge_cache[cache_key])   # cache hit: O(1) average
        else:
            content = fetch_from_origin(request.url)  # cache miss: expensive origin round trip
            edge_cache[cache_key] = content
            responses.append(content)
    return responses
```
This code processes each user request by computing a hash of the URL, checking the edge cache, and either returning cached content or fetching from the origin.
Look at the operations performed for each request.
- Primary operation: Hash computation and cache lookup for each request.
- How many times: Once per incoming request. The cache lookup itself is a hash table operation.
As the number of requests grows, the total work increases linearly, but each individual lookup remains constant time on average (the usual behavior of a hash table).
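That per-request behavior can be sketched with a small self-contained version of the lookup loop. The URL list, the in-memory dict standing in for the edge cache, and the fetch counter are all illustrative stand-ins, not real CDN machinery:

```python
def serve(urls):
    """Serve a list of URL requests, counting simulated origin fetches."""
    edge_cache = {}
    origin_fetches = 0
    responses = []
    for url in urls:
        key = hash(url)
        if key in edge_cache:                 # O(1) average-case dict lookup
            responses.append(edge_cache[key])
        else:
            origin_fetches += 1               # stands in for fetch_from_origin
            edge_cache[key] = f"content-for-{url}"
            responses.append(edge_cache[key])
    return responses, origin_fetches

# 10 requests over only 3 unique URLs: each unique URL hits the origin once.
_, fetches = serve(["/a", "/b", "/a", "/c", "/a", "/b", "/a", "/c", "/b", "/a"])
print(fetches)  # → 3
```

Ten lookups are performed (one per request), but only three reach the origin, matching the pattern the table below illustrates.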
| Input Size (n requests) | Cache Lookups | Origin Fetches (illustrative) |
|---|---|---|
| 100 requests | 100 (O(1) each) | 100 (worst case: all URLs unique) |
| 1,000 requests | 1,000 (O(1) each) | ~200 (most are cache hits) |
| 10,000 requests | 10,000 (O(1) each) | ~500 (high cache hit rate) |
Pattern observation: Cache lookups scale linearly with requests, but origin fetches grow much slower because repeated URLs are served from cache. The cache hit ratio improves as more content gets cached.
Time Complexity: O(n) for processing n requests
Each request requires O(1) time on average for the hash computation and cache lookup (hashing is proportional to URL length, which is bounded in practice). Processing n total requests therefore takes O(n) time. The key insight is that origin fetches (the expensive operation) become rarer as the cache warms up.
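The warm-up effect can be demonstrated with a toy simulation, assuming a hypothetical pool of 10 popular URLs requested round-robin: the first pass through the pool misses, and everything after that hits.

```python
# Hypothetical workload: 1,000 requests cycling through 10 popular URLs.
pool = [f"/asset-{i}" for i in range(10)]
cache = set()          # membership is all we need to count hits vs misses
hits = misses = 0
for n in range(1000):
    url = pool[n % len(pool)]
    if url in cache:
        hits += 1
    else:
        misses += 1    # first request for this URL goes to the origin
        cache.add(url)

print(misses, hits)            # → 10 990
print(hits / (hits + misses))  # → 0.99
```

After the cache holds the working set, the hit ratio approaches 1, which is why the expensive origin fetches stop dominating the total cost.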
[X] Wrong: "Every request requires fetching from the origin server, so the time grows with both requests and content size."
[OK] Correct: The entire purpose of a CDN cache is to avoid origin fetches. After the first request for a URL caches the content, all subsequent requests for the same URL are O(1) cache hits. The origin fetch only happens once per unique URL.
Understanding CDN cache lookup complexity helps explain why CDNs can handle millions of requests per second. Each edge server performs O(1) lookups, and the distributed nature means total capacity scales linearly with the number of edge servers.
What if the cache has a fixed size and uses LRU eviction? How would the cache hit rate change for a CDN serving a long-tail distribution of URLs where a few URLs get most traffic?
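One way to explore that question is a minimal LRU sketch built on `collections.OrderedDict`. The capacity of 3 and the request sequence are made up for illustration; the point is that under a long-tail workload, the frequently requested URL keeps getting refreshed and survives eviction while cold URLs cycle out:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache sketch with a fixed capacity (illustrative only)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

cache = LRUCache(3)
# Long-tail-ish sequence: "/hot" dominates, cold URLs appear once each.
for url in ["/hot", "/a", "/hot", "/b", "/hot", "/c", "/hot"]:
    if cache.get(url) is None:
        cache.put(url, f"content-for-{url}")
print(list(cache.data))  # the hot URL survives; the oldest cold URL was evicted
```

With a long-tail distribution, the few hot URLs stay pinned near the most-recently-used end, so the hit rate for the bulk of traffic remains high even though the cache is far smaller than the total URL space.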