
Why does PyPDFLoader return a list of documents instead of a single string when loading a PDF?

Difficulty: hard · Conceptual · Q10 of 15
LangChain - Document Loading
Why does PyPDFLoader return a list of documents instead of a single string when loading a PDF?
A. Because PDFs cannot be converted to text
B. Because it treats each PDF page as a separate document chunk
C. Because it loads only the first page by default
D. Because it merges all pages into one document internally
Step-by-Step Solution
  1. Step 1: Understand PyPDFLoader's internal design

    By default, PyPDFLoader extracts text one page at a time and returns a list of Document objects, one per page, each carrying metadata such as the page number.
  2. Step 2: Evaluate other options

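    PDFs can be converted to text (rules out A), the loader reads every page rather than only the first (rules out C), and it does not merge the pages into a single document (rules out D).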
  3. Final Answer:

    Because it treats each PDF page as a separate document chunk -> Option B
  4. Quick Check:

    Page-wise chunking explains list output [OK]
Quick Trick: Each page becomes a document chunk in the list [OK]
Common Mistakes:
  • Thinking PDFs can't be converted to text
  • Assuming only first page loads
  • Believing PyPDFLoader merges pages internally
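
The page-wise behaviour can be sketched without installing anything. The block below is a minimal illustration, not LangChain itself: the `Document` dataclass and the `load_pages` helper are hypothetical stand-ins, and the zero-based `page` key in the metadata is an assumption about how page numbers are recorded. It shows why a list is useful (each page stays addressable) and that a single string is still easy to build when needed.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a loader's document object (text plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_pages(page_texts, source="example.pdf"):
    """Mimic page-wise loading: one Document per page, tagged with its page index."""
    return [
        Document(page_content=text, metadata={"source": source, "page": i})
        for i, text in enumerate(page_texts)
    ]


# Pretend these strings are the extracted text of a three-page PDF.
docs = load_pages(["Intro text", "Method text", "Results text"])

# The result is a list, so individual pages stay addressable.
page_numbers = [d.metadata["page"] for d in docs]

# A single string is one join away when that is what you need.
full_text = "\n\n".join(d.page_content for d in docs)
```

With the real library, the equivalent list would typically come from calling `load()` on a `PyPDFLoader` instance created with the PDF path; the join step at the end works the same way on its output.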
