Recall & Review
beginner
What is text chunking in natural language processing?
Text chunking is the process of dividing text into smaller, meaningful pieces called chunks, such as phrases or sentences, to make it easier for machines to understand and analyze.
beginner
Name two common strategies used for text chunking.
Two common strategies are:
1. Fixed-size chunking: splitting text into equal-sized parts.
2. Semantic chunking: splitting text based on meaning, like sentences or phrases.
intermediate
Why is semantic chunking often better than fixed-size chunking?
Semantic chunking respects the meaning and structure of text, like sentences or paragraphs, which helps models understand context better than arbitrary fixed-size chunks.
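One way to picture this is a chunker that only ever breaks between sentences. The sketch below is a minimal illustration, not a standard implementation: the function name `sentence_chunks`, the `max_chars` limit, and the naive regex sentence splitter are all assumptions made for the example. Real text (abbreviations, decimals, quotes) would need a proper sentence tokenizer.

```python
import re


def sentence_chunks(text: str, max_chars: int = 80) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole in its own chunk.
    """
    # Naive splitter (assumption): break after '.', '!' or '?' plus whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = (
    "Chunking splits text. Each chunk keeps whole sentences. "
    "Models read chunks one at a time."
)
chunks = sentence_chunks(text, max_chars=60)
for chunk in chunks:
    print(repr(chunk))
```

Every chunk here ends at a sentence boundary, so no idea is cut in half. The trade-off is that chunk lengths vary.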
intermediate
What is a challenge when using fixed-size chunking?
Fixed-size chunking can split sentences or ideas in the middle, causing loss of meaning and making it harder for models to understand the text properly.
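The problem is easy to see with a small character-based splitter. This is a sketch under assumptions: the helper name `fixed_size_chunks` and the chunk size of 12 characters are chosen only for illustration, and real systems often count tokens rather than characters.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Split text into consecutive pieces of exactly `size` characters.

    The last piece may be shorter. Boundaries ignore words and sentences.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]


text = "Chunking splits text. Each chunk keeps meaning."
chunks = fixed_size_chunks(text, size=12)
for chunk in chunks:
    print(repr(chunk))
```

The boundaries fall inside "splits", "Each", and "keeps", so none of the chunks holds a complete sentence on its own.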
advanced
How can overlapping chunks improve text chunking?
Overlapping chunks include some shared text between chunks, which helps preserve context across chunks and reduces information loss at chunk boundaries.
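Overlap is commonly implemented as a sliding window whose step is smaller than the window size. The sketch below works on words for readability. The function name `overlapping_chunks` and the `size` and `overlap` parameters are hypothetical names for this example, not part of any particular library.

```python
def overlapping_chunks(words: list[str], size: int, overlap: int) -> list[list[str]]:
    """Slide a window of `size` words, repeating `overlap` words between chunks."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks: list[list[str]] = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            # This window already reached the end of the text.
            break
    return chunks


words = "the quick brown fox jumps over the lazy dog".split()
chunks = overlapping_chunks(words, size=4, overlap=2)
for chunk in chunks:
    print(" ".join(chunk))
```

Each chunk starts with the last two words of the previous one, so a phrase that straddles a boundary still appears intact in at least one chunk. The cost is that shared words are stored and processed more than once.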
What does text chunking help with in machine learning?
Text chunking breaks text into smaller parts to help machines understand and analyze it better.
Which chunking strategy respects sentence boundaries?
Semantic chunking splits text based on meaning, like sentences, preserving natural boundaries.
What is a downside of fixed-size chunking?
Fixed-size chunking may cut sentences arbitrarily, losing meaning.
Why use overlapping chunks?
Overlapping chunks share text to keep context between chunks.
Which is NOT a text chunking strategy?
Image chunking is unrelated to text chunking.
Explain what text chunking is and why it is useful in natural language processing.
Think about how breaking text into parts helps computers.
Describe the difference between fixed-size chunking and semantic chunking, including one advantage and one disadvantage of each.
Consider how chunks are created and how meaning is preserved.