Document loaders in Prompt Engineering / GenAI - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
What is the main purpose of a document loader in AI?

Imagine you want to teach a computer to understand many documents. What does a document loader do in this process?

A. It reads and converts documents into a format the AI can process.
B. It evaluates the AI model's accuracy on document tasks.
C. It generates new documents based on existing ones.
D. It trains the AI model directly on raw text data.
💡 Hint

Think about the first step before teaching the AI anything.

Predict Output
intermediate
Output of loading documents with a simple loader

Given this code snippet that loads text files, and assuming both files exist and are readable, what will be the output?

documents = ['doc1.txt', 'doc2.txt']
loaded = [open(doc).read() for doc in documents]
print(len(loaded))
A. 0
B. 2
C. Error: File not found
D. 1
💡 Hint

How many files are being read?

Model Choice
advanced
Choosing the best document loader for PDFs

You want to load text from PDF files for your AI project. Which document loader is best suited?

A. A loader that extracts text from PDF structure and layout.
B. A loader that reads plain text files line by line.
C. A loader that loads audio files as text.
D. A loader that only reads images without text extraction.
💡 Hint

PDFs have special formatting and structure.

Hyperparameter
advanced
Effect of chunk size in document loading

When loading large documents, you can split them into chunks. What happens if you set the chunk size too small?

A. The AI ignores the chunks and reads the whole document at once.
B. The AI merges chunks into one large text automatically.
C. The AI processes many small pieces, which may slow down training.
D. The AI crashes because chunk size must be large.
💡 Hint

Think about how many pieces the AI must handle.
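
To make the hint concrete, here is a minimal sketch of fixed-size chunking. The helper `split_into_chunks` is a hypothetical function written for this illustration (real splitters usually also handle overlap and sentence boundaries); it only shows how the number of pieces depends on the chunk size.

```python
def split_into_chunks(text: str, chunk_size: int) -> list[str]:
    """Hypothetical helper: cut text into consecutive pieces of chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


text = "word " * 200  # 1000 characters of sample text

# Count the pieces produced for a few chunk sizes.
for size in (500, 100, 10):
    chunks = split_into_chunks(text, size)
    print(f"chunk_size={size}: {len(chunks)} chunks")
```

Running the loop with different sizes lets you compare how many pieces a downstream step would have to handle.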

🔧 Debug
expert
Why does this document loader code raise an error?

Look at this code snippet and find why it raises an error:

docs = ['file1.txt', 'file2.txt']
loaded_docs = []
for doc in docs:
    with open(doc, 'r') as f
        loaded_docs.append(f.read())
print(len(loaded_docs))
A. The print statement is outside the loop, causing an indentation error.
B. The list 'docs' is empty, so no files to read.
C. The variable 'loaded_docs' is not defined before appending.
D. The missing colon ':' after 'with open(doc, 'r') as f' causes a SyntaxError.
💡 Hint

Check the syntax of the 'with' statement.

Practice

1. What is the main purpose of a document loader in AI applications?
easy
A. To visualize data in charts and graphs
B. To train AI models directly from raw data
C. To read files and convert their content into a format machines can understand
D. To compress files for storage

Solution

  1. Step 1: Understand the role of document loaders

    Document loaders are designed to read files and extract their content in a way that machines can process.
  2. Step 2: Differentiate from other tasks

    Training models or visualizing data are separate steps after loading the data.
  3. Final Answer:

    To read files and convert their content into a format machines can understand -> Option C
  4. Quick Check:

    Document loader = read and convert files [OK]
Hint: Remember: loaders prepare data, not train or visualize [OK]
Common Mistakes:
  • Confusing loading with training
  • Thinking loaders compress files
  • Assuming loaders create visualizations
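
The "read and convert" role can be sketched in a few lines. Both `Document` and `SimpleTextLoader` below are hypothetical stand-ins written for this illustration, not classes from any library; they only mirror the common pattern of pairing extracted text with metadata about its source.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Document:
    """Hypothetical container: the extracted text plus where it came from."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class SimpleTextLoader:
    """Hypothetical loader: reads one text file into a list of Document objects."""

    def __init__(self, path: str, encoding: str = "utf-8"):
        self.path = Path(path)
        self.encoding = encoding

    def load(self) -> list[Document]:
        # Read the raw file, then wrap it in a structure later steps can use.
        text = self.path.read_text(encoding=self.encoding)
        return [Document(page_content=text, metadata={"source": str(self.path)})]
```

Note that the loader only prepares data; training, evaluation, and visualization would happen in later, separate steps.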
2. Which of the following is the correct way to load a PDF file using a document loader in Python?
easy
A. loader = ImageLoader('file.pdf')
B. loader = PDFLoader('file.pdf')
C. loader = CSVLoader('file.pdf')
D. loader = TextLoader('file.pdf')

Solution

  1. Step 1: Identify the correct loader for PDF files

    PDFLoader is designed specifically to read PDF documents. (The name is generic here; in LangChain, for example, the corresponding class is PyPDFLoader.)
  2. Step 2: Check other loaders' purposes

    TextLoader is for plain text files, CSVLoader for CSV files, and ImageLoader for images, so they are incorrect for PDFs.
  3. Final Answer:

    loader = PDFLoader('file.pdf') -> Option B
  4. Quick Check:

    PDF file uses PDFLoader [OK]
Hint: Match loader type to file type exactly [OK]
Common Mistakes:
  • Using TextLoader for PDFs
  • Confusing CSVLoader with PDFLoader
  • Trying to load PDFs as images
3. Given the following Python code snippet, what will be the output type of documents after loading a text file?
from langchain_community.document_loaders import TextLoader
loader = TextLoader('sample.txt')
documents = loader.load()
medium
A. An integer representing file size
B. A single string with all text combined
C. A dictionary with file metadata
D. A list of Document objects containing the text content

Solution

  1. Step 1: Understand what TextLoader.load() returns

    The load() method returns a list of Document objects. For TextLoader, the list holds a single Document whose page_content is the file's full text.
  2. Step 2: Eliminate other options

    It does not return a single string, dictionary, or integer.
  3. Final Answer:

    A list of Document objects containing the text content -> Option D
  4. Quick Check:

    TextLoader.load() returns list of Documents [OK]
Hint: Loaders return lists of Documents, not raw strings [OK]
Common Mistakes:
  • Expecting a single string instead of list
  • Thinking output is metadata dictionary
  • Confusing output with file size
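
Because a loader returns a list rather than a string, code that needs one block of text has to join the pieces itself. The sketch below uses `SimpleNamespace` objects as stand-ins for loader output (an assumption made so the example runs without LangChain installed); the helper `combine_text` is hypothetical and relies only on each item exposing a `page_content` attribute.

```python
from types import SimpleNamespace

# Stand-ins for what a loader returns: a list of objects that each
# carry page_content (the text) and metadata (e.g. the source file).
documents = [
    SimpleNamespace(page_content="First part.", metadata={"source": "sample.txt"}),
    SimpleNamespace(page_content="Second part.", metadata={"source": "sample.txt"}),
]


def combine_text(docs) -> str:
    """Hypothetical helper: join the text of every document into one string."""
    return "\n".join(doc.page_content for doc in docs)


full_text = combine_text(documents)
```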
4. Identify the error in this code snippet for loading a PDF file:
from langchain_community.document_loaders import PyPDFLoader
loader = PyPDFLoader('document.txt')
docs = loader.load()
medium
A. The file extension does not match the loader type
B. Missing parentheses in load method
C. Incorrect import statement for PyPDFLoader
D. The variable name 'docs' is invalid

Solution

  1. Step 1: Check file name and loader compatibility

    PyPDFLoader expects a PDF file, but the file given is 'document.txt', a text file, so parsing it as a PDF is expected to fail.
  2. Step 2: Verify other code parts

    The parentheses are correct, the import is correct, and the variable name is valid.
  3. Final Answer:

    The file extension does not match the loader type -> Option A
  4. Quick Check:

    Loader and file type must match [OK]
Hint: Match file extension to loader type to avoid errors [OK]
Common Mistakes:
  • Ignoring file extension mismatch
  • Thinking variable names cause errors
  • Assuming import is wrong without checking
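
A mismatch like this can be caught before the loader runs. The sketch below is one possible guard, not part of any loader library: the names `looks_like_pdf` and `check_pdf_input` are hypothetical, and the content check assumes the file begins with the usual `%PDF-` header, which holds for most but not all valid PDFs.

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"  # header bytes at the start of a typical PDF file


def looks_like_pdf(path: str) -> bool:
    """Return True if the file starts with the PDF header bytes."""
    with open(path, "rb") as f:
        return f.read(len(PDF_MAGIC)) == PDF_MAGIC


def check_pdf_input(path: str) -> None:
    """Raise ValueError early when the extension or the content is not PDF."""
    if Path(path).suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got {path!r}")
    if not looks_like_pdf(path):
        raise ValueError(f"{path!r} does not start with a PDF header")
```

Calling `check_pdf_input('document.txt')` would raise on the extension check, before any PDF parsing is attempted.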
5. You want to load multiple document types (PDF, TXT, CSV) for an AI model training pipeline. Which approach best handles this using document loaders?
hard
A. Use separate loaders for each file type and combine their outputs into one list
B. Use only TextLoader for all files regardless of type
C. Convert all files to images and use ImageLoader
D. Load only PDF files and ignore others

Solution

  1. Step 1: Understand file type differences

    Different file types require different loaders to correctly extract content.
  2. Step 2: Combine outputs for unified processing

    Using separate loaders and merging their outputs ensures all data is loaded properly for training.
  3. Final Answer:

    Use separate loaders for each file type and combine their outputs into one list -> Option A
  4. Quick Check:

    Different loaders + combine outputs = best practice [OK]
Hint: Use correct loader per file, then merge results [OK]
Common Mistakes:
  • Using one loader for all file types
  • Ignoring non-PDF files
  • Converting files unnecessarily to images
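
The "separate loaders, one combined list" approach can be sketched with a small dispatch table. Everything here is illustrative: `load_txt`, `load_csv`, `LOADERS`, and `load_all` are hypothetical names, records are plain dictionaries rather than a library's Document type, and a PDF loader is left out because it would need a third-party parser.

```python
import csv
from pathlib import Path


def load_txt(path: Path) -> list[dict]:
    """One record holding the whole text file."""
    return [{"text": path.read_text(encoding="utf-8"), "source": str(path)}]


def load_csv(path: Path) -> list[dict]:
    """One record per CSV row, rendered as 'column: value' pairs."""
    with path.open(newline="", encoding="utf-8") as f:
        return [
            {"text": ", ".join(f"{k}: {v}" for k, v in row.items()), "source": str(path)}
            for row in csv.DictReader(f)
        ]


# Map each extension to its loader; a PDF loader would be registered under ".pdf".
LOADERS = {".txt": load_txt, ".csv": load_csv}


def load_all(paths) -> list[dict]:
    """Pick a loader per file by extension and merge all results into one list."""
    documents = []
    for path in map(Path, paths):
        loader = LOADERS.get(path.suffix.lower())
        if loader is None:
            raise ValueError(f"no loader registered for {path.suffix!r}")
        documents.extend(loader(path))
    return documents
```

Keeping the mapping in one dictionary means supporting a new file type is a matter of registering one more function, while the rest of the pipeline keeps working with a single list.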