Challenge - 5 Problems

Semantic Chunking Master

🧠 Conceptual

intermediate

Understanding Semantic Chunking Purpose

What is the main advantage of using semantic chunking in LangChain when processing large documents?

A. It compresses text chunks to reduce storage space.

B. It groups text based on meaning, preserving context for better retrieval.

C. It splits text into fixed-size chunks regardless of meaning, improving speed.

D. It translates text chunks into multiple languages automatically.
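
To make the difference between fixed-size and meaning-based splitting concrete, here is a minimal, library-free sketch. It is not LangChain's implementation: the helper names (`fixed_chunks`, `jaccard`, `semantic_chunks`) and the `threshold` value are hypothetical, and simple word-overlap (Jaccard) similarity stands in for the embedding similarity a real semantic chunker would typically rely on.

```python
import re

def fixed_chunks(text, size):
    """Cut text every `size` characters, ignoring sentence boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def jaccard(a, b):
    """Word-overlap similarity between two sentences (0.0 to 1.0)."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def semantic_chunks(text, threshold=0.2):
    """Group consecutive sentences while each stays similar to the previous one."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    groups = []
    for sentence in sentences:
        if groups and jaccard(groups[-1][-1], sentence) >= threshold:
            groups[-1].append(sentence)   # same topic: extend the current chunk
        else:
            groups.append([sentence])     # topic shift: start a new chunk
    return [" ".join(g) for g in groups]

doc = ("Cats sleep most of the day. Cats also sleep at night. "
       "Stock markets fell sharply. Markets fell again on Friday.")

print(fixed_chunks(doc, 40))   # cuts can land mid-sentence
print(semantic_chunks(doc))    # cuts follow topic shifts
```

The fixed-size version is cheaper to compute, but its boundaries are unrelated to content; the grouped version keeps related sentences together at the cost of a similarity computation per sentence pair.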

❓ component_behavior

intermediate

Behavior of RecursiveCharacterTextSplitter

Given the RecursiveCharacterTextSplitter in LangChain, what happens when it encounters a large text with nested sections?

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

text = '''Section 1\nThis is some text.\n\nSection 2\nMore detailed text here.\n\nSection 3\nFinal notes.'''
splitter = RecursiveCharacterTextSplitter(chunk_size=50, chunk_overlap=10)
chunks = splitter.split_text(text)
print(len(chunks))
```

A. It splits text recursively by separators like newlines and spaces, producing meaningful chunks under 50 characters.

B. It splits the text into chunks of exactly 50 characters each, ignoring sections.

C. It returns the entire text as one chunk because chunk_size is too large.

D. It raises a TypeError because chunk_overlap cannot be set.
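
The general idea of separator-based recursive splitting can be sketched without any library. The function below is a simplified illustration, not LangChain's code: `recursive_split` and its default separator order are assumptions made for this example, and it deliberately omits the merging of small pieces and the overlap handling that a production splitter performs.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split on the coarsest separator first, recursing into oversized pieces.

    Simplified sketch: small neighbouring pieces are not merged back together
    and no overlap is added between chunks.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(sep):
        # Pieces that already fit are kept; larger ones try the next separator.
        chunks.extend(recursive_split(piece, chunk_size, rest))
    return chunks

text = ("Section 1\nThis is some text.\n\n"
        "Section 2\nMore detailed text here.\n\n"
        "Section 3\nFinal notes.")
chunks = recursive_split(text, chunk_size=50)
for chunk in chunks:
    print(repr(chunk))
```

Trying paragraph breaks before line breaks, and line breaks before spaces, is what lets the chunk boundaries follow the document's own structure whenever the pieces fit within the size limit.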

📝 Syntax

advanced

Correct Usage of CharacterTextSplitter Parameters

Which option correctly initializes a CharacterTextSplitter that creates chunks of 100 characters with a 20-character overlap and splits on spaces?

A. CharacterTextSplitter(chunk_size=100, chunk_overlap=20, separator=' ')

B. CharacterTextSplitter(chunk_size=100, chunk_overlap=20, separator='\n')

C. CharacterTextSplitter(chunk_size=100, chunk_overlap=20, separator=None)

D. CharacterTextSplitter(chunk_size=100, chunk_overlap=20, separator='')

🔧 Debug

advanced

Identifying Error in Custom Chunking Function

What happens when this custom chunking function is used in LangChain: which error, if any, is raised, and why?

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

def custom_chunker(text):
    chunks = []
    for i in range(0, len(text), 50):
        chunks.append(text[i:i+50])
    return chunks

splitter = RecursiveCharacterTextSplitter(chunk_size=50, chunk_overlap=10)
splitter.split_text = custom_chunker
chunks = splitter.split_text('Hello world! This is a test of custom chunking.')
print(len(chunks))
```

A. ValueError because chunk_overlap is not handled in custom_chunker.

B. AttributeError because split_text cannot be reassigned.

C. No error; prints the number of chunks correctly.

D. TypeError because split_text expects a method with different signature.
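
The Python mechanics behind assigning a function to an instance attribute can be tried in isolation. The sketch below uses `ToySplitter`, a hypothetical stand-in class that is not part of LangChain; whether a given library class behaves the same way depends on how it is defined (classes that restrict attribute assignment, for example through `__slots__` or strict validation, would differ).

```python
class ToySplitter:
    """Hypothetical stand-in for a splitter class; not part of LangChain."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size

    def split_text(self, text):
        return text.split()

def custom_chunker(text):
    # Fixed 50-character windows; overlap is simply not handled here.
    return [text[i:i + 50] for i in range(0, len(text), 50)]

splitter = ToySplitter(chunk_size=50)
# Assigning on the instance stores a plain function in the instance's
# __dict__. It is not bound, so it is called with `text` only (no `self`).
splitter.split_text = custom_chunker
chunks = splitter.split_text("Hello world! This is a test of custom chunking.")
print(len(chunks))

# The class itself is untouched, so other instances keep the original method.
other = ToySplitter(chunk_size=50)
print(other.split_text("a b c"))
```

For an ordinary class, the instance attribute shadows the class method for that one object only, which is why the replacement function must accept exactly the arguments the caller passes.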

❓ state_output

expert

Output of Chunking with Overlap in LangChain

What is the output list of chunks when splitting the text 'abcdefghij' with chunk_size=4 and chunk_overlap=2 using CharacterTextSplitter with separator=None?

```python
from langchain.text_splitter import CharacterTextSplitter

text = 'abcdefghij'
splitter = CharacterTextSplitter(chunk_size=4, chunk_overlap=2, separator=None)
chunks = splitter.split_text(text)
print(chunks)
```

A. ['abcd', 'cdef', 'efgh', 'fghi', 'ghij']

B. ['abcd', 'bcde', 'cdef', 'defg', 'efgh', 'fghi', 'ghij']

C. ['abcd', 'cdef', 'efgh', 'ghij']

D. ['abcd', 'bcde', 'cdef', 'efgh', 'ghij']
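
The arithmetic of overlapping windows can be worked through with a plain sliding window, where each new chunk starts `chunk_size - chunk_overlap` characters after the previous one. This is a hedged, library-free sketch: `sliding_chunks` is a hypothetical helper, and LangChain's own splitters build chunks by merging separator-delimited pieces, so their behavior (including how `separator=None` is handled) may differ between versions.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Character windows of chunk_size, each sharing chunk_overlap chars with the previous."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
```

With a size of 4 and an overlap of 2, the window advances two characters at a time, so consecutive chunks share exactly their last and first two characters.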
