HLD, system design, about 10 minutes

Design a web crawler in HLD - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the sentence to name the main component responsible for fetching web pages.

HLD
The web crawler starts with the [1] component to download web pages.
A. Fetcher
B. Indexer
C. URL Scheduler
D. Parser
Common Mistakes
Confusing the Fetcher with the Parser or Scheduler.
2. Fill in the blank (medium)

Complete the sentence to identify the component that manages the list of URLs to visit.

HLD
The [1] manages the queue of URLs that the crawler needs to visit next.
A. Parser
B. Fetcher
C. URL Scheduler
D. Indexer
Common Mistakes
Mixing up the URL Scheduler with the Fetcher or Parser.
3. Fill in the blank (hard)

Identify the component that extracts links from downloaded pages.

HLD
The [1] extracts URLs and content from the fetched web pages.
A. Parser
B. Fetcher
C. Indexer
D. Scheduler
Common Mistakes
Confusing the Parser with the Fetcher or Indexer.
4. Fill in the blanks (hard)

Fill both blanks to complete the description of the crawler's storage components.

HLD
The [1] stores the raw web pages, while the [2] stores the processed data for search.
A. Content Repository
B. URL Scheduler
C. Index
D. Fetcher
Common Mistakes
Mixing up the two storage roles, or confusing storage with the fetching components.
5. Fill in the blanks (hard)

Fill all three blanks to complete the request flow in the crawler system.

HLD
First, the [1] selects a URL, then the [2] downloads the page, and finally the [3] extracts links and data.
A. URL Scheduler
B. Fetcher
C. Parser
D. Indexer
Common Mistakes
Putting the steps in the wrong order, or confusing one component with another.
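
For review after attempting the tasks, the flow described in task 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the `FAKE_WEB` dictionary stands in for real HTTP downloads so the sketch runs offline, and the names `URLScheduler`, `fetch`, `LinkParser`, and `crawl` are hypothetical, chosen only to mirror the component names used in the tasks. Politeness rules, robots.txt handling, retries, and distribution across machines are all left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web" so the sketch runs without network access.
FAKE_WEB = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a> alpha page',
    "http://example.test/b": '<a href="/">home</a> beta page',
}


class URLScheduler:
    """Manages the queue of URLs to visit and skips URLs already seen."""

    def __init__(self, seeds):
        self.queue = deque(seeds)
        self.seen = set(seeds)

    def add(self, url):
        if url not in self.seen:
            self.seen.add(url)
            self.queue.append(url)

    def next_url(self):
        return self.queue.popleft() if self.queue else None


def fetch(url):
    """Fetcher: downloads a page (here, just a dictionary lookup)."""
    return FAKE_WEB.get(url)


class LinkParser(HTMLParser):
    """Parser: extracts absolute link URLs and visible words from a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

    def handle_data(self, data):
        self.words.extend(data.split())


def crawl(seeds, max_pages=10):
    scheduler = URLScheduler(seeds)
    content_repository = {}  # raw pages, keyed by URL
    index = {}               # processed data: word -> set of URLs
    order = []               # URLs in the order they were crawled

    while len(order) < max_pages:
        url = scheduler.next_url()       # 1. scheduler selects a URL
        if url is None:
            break
        html = fetch(url)                # 2. fetcher downloads the page
        if html is None:
            continue
        content_repository[url] = html
        parser = LinkParser(url)         # 3. parser extracts links and data
        parser.feed(html)
        for link in parser.links:
            scheduler.add(link)
        for word in parser.words:
            index.setdefault(word.lower(), set()).add(url)
        order.append(url)

    return order, content_repository, index


order, repo, index = crawl(["http://example.test/"])
print(order)
```

Each loop iteration follows the same three steps as the task sentence: select, download, extract. The two dictionaries play the storage roles from task 4, with raw pages in one and search-ready data in the other.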