Medium · Component behavior · Q13 of 15
LangChain - Text Splitting
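Given this code snippet, which chunks text by size with an overlap (chunk_text is not defined in the snippet; treat it as a helper around a LangChain text splitter):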
text = "Hello world. This is a test. Langchain helps chunk text."
chunks = chunk_text(text, chunk_size=12, chunk_overlap=4)
print(chunks)
What is the expected output?
A. SyntaxError due to invalid chunking parameters
B. ["Hello world. This", "This is a test.", "Langchain helps chunk text."]
C. ["Hello", "world.", "This", "is", "a", "test."]
D. ["Hello world.", "world. This is", "This is a test.", "a test. Langchain", "Langchain helps", "helps chunk text."]
Step-by-Step Solution
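  1. Step 1: Understand chunk size and overlap effect

    chunk_size=12 sets the target length of each chunk (about 12 characters), and chunk_overlap=4 means each new chunk starts before the previous one ends, so neighboring chunks share some text.
  2. Step 2: Trace chunk creation on text

    The first chunk is "Hello world." (exactly 12 characters). Each later chunk repeats the tail of the one before it: "world." reappears in "world. This is", "This is" reappears in "This is a test.", and so on. The chunks in Option D are aligned to word boundaries, so several run past 12 characters and the shared text is not always exactly 4 characters; a strict character-level split would cut mid-word instead.
  3. Final Answer:

    ["Hello world.", "world. This is", "This is a test.", "a test. Langchain", "Langchain helps", "helps chunk text."] -> Option D
  4. Quick Check:

    Chunk size 12 + overlap 4 = every neighboring pair of chunks overlaps, as in Option D [OK]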
Quick Trick: Overlap means next chunk starts before previous ends [OK]
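To make the overlap rule concrete, here is a minimal sketch of a size-plus-overlap splitter in plain Python. The quiz never defines chunk_text, so this version is an assumption: it uses a strict character-level sliding window and is not LangChain's implementation. A splitter that snaps to word boundaries, as the chunks in Option D suggest, would return different strings.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Hypothetical helper: strict character-level sliding window.

    Each chunk holds at most chunk_size characters, and each new chunk
    starts chunk_overlap characters before the previous one ends.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


text = "Hello world. This is a test. Langchain helps chunk text."
chunks = chunk_text(text, chunk_size=12, chunk_overlap=4)
print(chunks)
```

With these parameters the window advances 8 characters at a time, so the first two chunks are "Hello world." and "rld. This is". The second chunk starts mid-word, which is the difference between a strict character split and the word-aligned chunks listed in Option D.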
Common Mistakes:
  • Ignoring overlap and making chunks non-overlapping
  • Splitting by words instead of characters
  • Assuming syntax error without checking parameters
