LangChain - Document Loading
You want to load text from multiple web pages using WebBaseLoader. Which approach correctly loads texts from two URLs and combines their content into a single string?
A.
loader1 = WebBaseLoader("https://site1.com")
loader2 = WebBaseLoader("https://site2.com")
docs = loader1.load() + loader2.load()
combined_text = "".join(doc.page_content for doc in docs)
Step 1: Understand WebBaseLoader usage for multiple URLs
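In the correct option, each WebBaseLoader instance is constructed with a single URL, and calling load() on it returns a list of Document objects for that page. Creating one loader per URL and loading each is therefore a valid way to fetch several pages. (WebBaseLoader can also be constructed with a list of URLs, so this is not the only way.)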
Step 2: Combine loaded documents correctly
loader1 = WebBaseLoader("https://site1.com")
loader2 = WebBaseLoader("https://site2.com")
docs = loader1.load() + loader2.load()
combined_text = "".join(doc.page_content for doc in docs)
This creates two loaders, loads the documents from each, concatenates the two lists, and joins the page_content of every document into one string.
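The combining step above can be tried without network access. The sketch below is a minimal illustration, not LangChain's implementation: StubDocument, StubLoader, and combine_pages are hypothetical names that mimic only the two things the answer relies on, a load() method that returns a list and a page_content attribute on each item. With the real library, WebBaseLoader instances would take the place of the stubs.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class StubDocument:
    """Hypothetical stand-in for a LangChain Document (page_content only)."""
    page_content: str


class StubLoader:
    """Hypothetical stand-in for a loader whose load() takes no arguments."""

    def __init__(self, text: str) -> None:
        self._text = text

    def load(self) -> List[StubDocument]:
        # A real loader would fetch and parse a web page here.
        return [StubDocument(page_content=self._text)]


def combine_pages(loaders: Iterable[StubLoader], separator: str = "") -> str:
    """Load every loader, concatenate the document lists, join the text."""
    docs: List[StubDocument] = []
    for loader in loaders:
        docs += loader.load()  # same effect as loader1.load() + loader2.load()
    return separator.join(doc.page_content for doc in docs)


loader1 = StubLoader("Text from site 1.")
loader2 = StubLoader("Text from site 2.")

combined_text = combine_pages([loader1, loader2])
spaced_text = combine_pages([loader1, loader2], separator="\n\n")
```

Joining with an empty string, as the answer does, runs the pages together with nothing between them. Passing a separator such as "\n\n" keeps the page boundaries visible in the combined string.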
Step 3: Identify why other options fail
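loader = WebBaseLoader(["https://site1.com", "https://site2.com"])
docs = loader.load()
combined_text = " ".join(docs.page_content)
This fails at the join: docs is a list of Document objects, and a list has no page_content attribute. Passing a list of URLs to one loader is accepted by WebBaseLoader; the bug is the missing loop over docs.
docs = WebBaseLoader.load("https://site1.com", "https://site2.com")
combined_text = " ".join(doc.page_content for doc in docs)
This calls load() on the class instead of on an instance and passes URLs to it. load() is an instance method that takes no URL arguments, so the call is invalid.
loader = WebBaseLoader("https://site1.com")
combined_text = loader.load("https://site2.com")
This passes a URL to load(), which takes no arguments. Even without the argument, load() returns a list of documents, not a string.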
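The failing options can be reproduced with plain Python. This is a sketch under assumptions: StubDocument and StubLoader are hypothetical stand-ins whose load() has the shape described here (an instance method with no arguments that returns a list), so the exceptions recorded below are Python's own and not messages produced by LangChain.

```python
class StubDocument:
    """Hypothetical stand-in for a Document with a page_content attribute."""

    def __init__(self, page_content: str) -> None:
        self.page_content = page_content


class StubLoader:
    """Hypothetical stand-in: the URL goes to the constructor, not to load()."""

    def __init__(self, url: str) -> None:
        self.url = url

    def load(self) -> list:
        return [StubDocument(f"text from {self.url}")]


errors = {}

# 1. load() returns a list, and a list has no page_content attribute.
docs = StubLoader("https://site1.com").load()
try:
    " ".join(docs.page_content)
except AttributeError as exc:
    errors["page_content_on_list"] = type(exc).__name__

# 2. Calling load() on the class with URLs: there is no loader instance,
#    and load() has no parameter to receive a second URL.
try:
    StubLoader.load("https://site1.com", "https://site2.com")
except TypeError as exc:
    errors["load_on_class"] = type(exc).__name__

# 3. Passing a URL to load() on an instance: load() takes no arguments.
try:
    StubLoader("https://site1.com").load("https://site2.com")
except TypeError as exc:
    errors["url_passed_to_load"] = type(exc).__name__
```

After this runs, errors holds one entry per failing pattern, which makes it easy to see that the first problem is an attribute lookup on a list while the other two are bad call signatures.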
Final Answer:
loader1 = WebBaseLoader("https://site1.com")
loader2 = WebBaseLoader("https://site2.com")
docs = loader1.load() + loader2.load()
combined_text = "".join(doc.page_content for doc in docs)
-> Option A
Quick Check:
One loader per URL, then combine texts = A [OK]
Quick Trick: Create one loader per URL, then merge results [OK]
Common Mistakes:
Reading page_content from the list returned by load() instead of from each document
Calling load() with URL arguments
Calling load() on the WebBaseLoader class instead of on an instance