Challenge - 5 Problems
Semantic Chunking Master
🧠 Conceptual
intermediate (time limit 2:00)
Understanding Semantic Chunking Purpose
What is the main advantage of using semantic chunking in LangChain when processing large documents?
💡 Hint
Think about why preserving meaning matters when splitting text.
Explanation
Semantic chunking splits text at boundaries of meaning rather than at fixed character counts. Each chunk keeps its context intact, which improves the quality of information retrieval in LangChain.
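To make the idea concrete, here is a minimal sketch of meaning-based grouping in plain Python. It is not LangChain's implementation: the word-overlap (Jaccard) score is only a crude stand-in for embedding similarity, and the names `jaccard`, `semantic_chunks`, and the `threshold` value are hypothetical choices made for illustration.

```python
import re

def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a crude stand-in for embedding similarity."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

def semantic_chunks(sentences, threshold=0.2):
    """Start a new chunk whenever adjacent sentences share too few words."""
    chunks = []
    current = []
    for sentence in sentences:
        # A low score against the previous sentence marks a topic change.
        if current and jaccard(current[-1], sentence) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks

sentences = [
    "Cats are small pets.",
    "Cats are playful pets.",
    "Stocks fell sharply today.",
    "Stocks fell after earnings.",
]
result = semantic_chunks(sentences)
print(result)
```

The two sentences about cats end up in one chunk and the two about stocks in another, so each chunk stays on a single topic.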
❓ component_behavior
intermediate (time limit 2:00)
Behavior of RecursiveCharacterTextSplitter
Given the RecursiveCharacterTextSplitter in LangChain, what happens when it encounters a large text with nested sections?
```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

text = '''Section 1\nThis is some text.\n\nSection 2\nMore detailed text here.\n\nSection 3\nFinal notes.'''

splitter = RecursiveCharacterTextSplitter(chunk_size=50, chunk_overlap=10)
chunks = splitter.split_text(text)
print(len(chunks))
```
💡 Hint
Consider how recursive splitting works with separators.
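Explanation
RecursiveCharacterTextSplitter tries its separators from coarsest to finest (by default "\n\n", "\n", " ", then ""), and moves to a finer separator only for pieces that still exceed chunk_size. Adjacent pieces are merged back together while they fit within chunk_size, with up to chunk_overlap characters carried over between chunks. In this example the three sections are 28, 34, and 22 characters long, and no two neighbours fit within 50 characters, so the text yields 3 chunks.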
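The coarse-to-fine strategy can be sketched in plain Python as follows. This is a simplified illustration, not LangChain's actual code: the function name `recursive_split` is hypothetical, chunk overlap is deliberately left out, and separators are dropped rather than kept.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split on the coarsest separator, recursing only into oversized pieces."""
    if len(text) <= chunk_size:
        return [text] if text else []
    # Last resort: no separator left, so cut fixed-size slices.
    if not separators or separators[0] == "":
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks = []
    current = ""
    for part in text.split(sep):
        if not part:
            continue
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            # Still fits: keep merging neighbouring pieces.
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(part) <= chunk_size:
            current = part
        else:
            # Piece is too large on its own: retry with finer separators.
            chunks.extend(recursive_split(part, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks

text = "Section 1\nThis is some text.\n\nSection 2\nMore detailed text here.\n\nSection 3\nFinal notes."
sections = recursive_split(text, chunk_size=50)
print(len(sections))  # 3
print(recursive_split("aaaa bbbb cccc", chunk_size=9))  # ['aaaa bbbb', 'cccc']
```

In the first call every section fits within 50 characters but no two neighbours do, so the paragraph separator alone is enough. In the second call there are no newlines, so the sketch falls through to splitting on spaces.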
📝 Syntax
advanced (time limit 2:00)
Correct Usage of CharacterTextSplitter Parameters
Which option correctly initializes a CharacterTextSplitter that splits on spaces and creates chunks of 100 characters with a 20-character overlap?
💡 Hint
Check which separator splits on spaces.
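Explanation
CharacterTextSplitter(separator=' ', chunk_size=100, chunk_overlap=20) splits the text on spaces, then merges the resulting words into chunks of at most 100 characters, carrying up to 20 characters of overlap from one chunk into the next.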
🔧 Debug
advanced (time limit 2:00)
Identifying Error in Custom Chunking Function
What happens when this custom chunking function replaces split_text on a splitter instance in LangChain, and why?
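```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

def custom_chunker(text):
    # Cut the text into consecutive 50-character slices (no overlap).
    chunks = []
    for i in range(0, len(text), 50):
        chunks.append(text[i:i+50])
    return chunks

splitter = RecursiveCharacterTextSplitter(chunk_size=50, chunk_overlap=10)
# Replace split_text on this one instance with the plain function.
splitter.split_text = custom_chunker
chunks = splitter.split_text('Hello world! This is a test of custom chunking.')
print(len(chunks))
```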
💡 Hint
Consider the signature of split_text and whether custom_chunker matches it.
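Explanation
split_text is defined on the class as an instance method (def split_text(self, text)), while custom_chunker accepts only text. Python passes self only when a function is looked up on the class, so the mismatch raises TypeError (custom_chunker() takes 1 positional argument but 2 were given) when the function is assigned to the class, as in RecursiveCharacterTextSplitter.split_text = custom_chunker. Assigned to a single instance, as splitter.split_text = custom_chunker does, the function is stored as a plain attribute and called with text alone, so no TypeError is raised and the 47-character string comes back as one chunk.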
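Whether self is passed depends on where the function is attached. The sketch below shows both cases in plain Python, using a hypothetical `ToySplitter` class as a stand-in so it runs without LangChain installed. The class names and variables are illustrative assumptions, not LangChain API.

```python
class ToySplitter:
    """Hypothetical stand-in for a splitter class; not part of LangChain."""

    def split_text(self, text):
        return [text]

def custom_chunker(text):
    """Fixed-size chunker that takes only the text, with no self parameter."""
    return [text[i:i + 50] for i in range(0, len(text), 50)]

sample = "Hello world! This is a test of custom chunking."

# Case 1: assign on one instance. The function is stored as a plain
# attribute, so Python does not bind it and no self is passed.
splitter = ToySplitter()
splitter.split_text = custom_chunker
instance_result = splitter.split_text(sample)
print(len(instance_result))  # 1, the 47-character sample fits in one slice

# Case 2: assign on a class. Lookup through an instance now produces a
# bound method, so self is passed as an extra first argument.
class PatchedSplitter(ToySplitter):
    pass

PatchedSplitter.split_text = custom_chunker
try:
    PatchedSplitter().split_text(sample)
    class_error = None
except TypeError as exc:
    class_error = exc
print(type(class_error).__name__)  # TypeError
```

If the goal is to customize chunking, subclassing and defining split_text(self, text) avoids the signature mismatch in either case.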
❓ state_output
expert (time limit 3:00)
Output of Chunking with Overlap in LangChain
What list of chunks is produced when splitting the text 'abcdefghij' with chunk_size=4 and chunk_overlap=2 using CharacterTextSplitter with separator=''?
```python
from langchain.text_splitter import CharacterTextSplitter

text = 'abcdefghij'

# separator='' splits the text into single characters
# (separator expects a str, so None is avoided here).
splitter = CharacterTextSplitter(chunk_size=4, chunk_overlap=2, separator='')
chunks = splitter.split_text(text)
print(chunks)
```
💡 Hint
Remember overlap means the next chunk starts chunk_size - chunk_overlap characters after the previous chunk start.
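Explanation
With an empty-string separator the text is split into single characters, which are then merged back into chunks of at most 4 characters, keeping the last 2 as overlap. Each chunk therefore starts chunk_size - chunk_overlap = 2 characters after the previous one: 'abcd', 'cdef', 'efgh', 'ghij'.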
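The start-offset arithmetic can be checked with a small character-window sketch. This is an illustration only, not LangChain's merge logic (which works on split pieces rather than raw offsets), and `sliding_chunks` is a hypothetical helper name.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Fixed-size character windows whose starts are chunk_size - chunk_overlap apart."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window reaches the end, to avoid a trailing fragment
        # that only repeats the overlap.
        if start + chunk_size >= len(text):
            break
    return chunks

windows = sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
print(windows)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

Windows start at offsets 0, 2, 4, and 6, which matches the step of 2 described in the hint.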