Hard · Component behavior · Q8 of 15
LangChain - Text Splitting
To split a lengthy document into 100-character chunks with 20-character overlaps while preserving paragraph boundaries as much as possible, which separators list should you use in RecursiveCharacterTextSplitter?
A. [" ", ". ", "\n", "\n\n"]
B. [". ", "\n", "\n\n", " "]
C. ["\n\n", "\n", ". ", " "]
D. ["\n", " ", ". ", "\n\n"]
Step-by-Step Solution
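  1. Step 1: Understand separator priority

    RecursiveCharacterTextSplitter tries the separators in list order and falls back to the next one only for pieces that are still longer than chunk_size. To preserve paragraphs, the double newline ("\n\n") must come first.
  2. Step 2: Order the remaining separators

    After paragraphs, fall back to single newlines ("\n"), then sentence ends (". "), then spaces between words (" ").
  3. Step 3: Match the correct order

    ["\n\n", "\n", ". ", " "] matches this order exactly.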
  4. Final Answer:

    ["\n\n", "\n", ". ", " "] -> Option C
  5. Quick Check:

    Paragraphs are split first, then lines, then sentences, then words [OK]
Quick Trick: Order separators from largest to smallest text unit [OK]
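
The configuration from the question could be written roughly as follows. This is a minimal sketch that assumes the `langchain-text-splitters` package is installed (older LangChain releases expose the same class under `langchain.text_splitter`); the names `SEPARATORS` and `split_document` are illustrative and not part of LangChain.

```python
# Largest text unit first: paragraph, line, sentence, word.
SEPARATORS = ["\n\n", "\n", ". ", " "]


def split_document(text: str) -> list[str]:
    """Split text into chunks of at most 100 characters with 20 characters of overlap."""
    # Imported here so the dependency is only required when splitting is actually done.
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=100,        # maximum chunk length, measured with len() by default
        chunk_overlap=20,      # characters shared between neighbouring chunks
        separators=SEPARATORS,
    )
    return splitter.split_text(text)
```

Calling `split_document(long_text)` returns a list of strings. Chunks end at paragraph breaks wherever a paragraph fits within the limit, and finer separators are used only for paragraphs that are too long.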
Common Mistakes:
  • Reversing the order of separators
  • Ignoring paragraph preservation
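
To see why reversing the order hurts, here is a simplified pure-Python sketch of the recursive idea. It is an illustration only, not LangChain's implementation: it ignores overlap and separator retention, and `recursive_split` is a hypothetical helper.

```python
def recursive_split(text: str, separators: list[str], chunk_size: int) -> list[str]:
    """Split on the first separator found in text, recursing only into oversized parts."""
    sep, rest = None, []
    for i, candidate in enumerate(separators):
        if candidate in text:
            sep, rest = candidate, separators[i + 1:]
            break
    if sep is None:
        return [text]  # nothing left to split on

    chunks, current = [], ""
    for part in text.split(sep):
        if len(part) > chunk_size:
            # Too big for one chunk: flush what we have, then try finer separators.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(recursive_split(part, rest, chunk_size))
        elif current and len(current) + len(sep) + len(part) <= chunk_size:
            current += sep + part  # merge small neighbours while they still fit
        else:
            if current:
                chunks.append(current)
            current = part
    if current:
        chunks.append(current)
    return chunks


text = "Alpha beta gamma. Delta epsilon.\n\nZeta eta theta. Iota kappa."
good = recursive_split(text, ["\n\n", "\n", ". ", " "], chunk_size=40)
bad = recursive_split(text, [" ", ". ", "\n", "\n\n"], chunk_size=40)
```

With the paragraph separator first, `good` holds one chunk per paragraph. With the reversed list, the text is cut at spaces first, so the first chunk in `bad` runs across the paragraph break.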