Performance: Why chunk size affects retrieval quality
MEDIUM IMPACT
Chunk size affects how quickly and how accurately relevant information is retrieved from documents, which in turn determines user wait time and result relevance.
    chunk_size = 200  # Moderate chunk size
    retrieved_chunks = retriever.get_relevant_chunks(query, chunk_size=chunk_size)

    chunk_size = 1000  # Very large chunks
    retrieved_chunks = retriever.get_relevant_chunks(query, chunk_size=chunk_size)
| Pattern | Chunks Processed | Processing Time | Retrieval Precision | Verdict |
|---|---|---|---|---|
| Large chunk size (1000 tokens) | Fewer chunks | Higher per chunk | Lower precision | [!] Bad |
| Moderate chunk size (200 tokens) | More chunks | Lower per chunk | Higher precision | [OK] Good |
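
The precision trade-off in the table can be illustrated with a small, self-contained sketch. It is a toy model, not the `retriever.get_relevant_chunks` API shown above: the helper names (`split_into_chunks`, `best_chunk`, `relevant_fraction`) are hypothetical, tokens are approximated as whitespace-separated words, and relevance is assumed to mean a plain keyword match. The document is synthetic: 1000 words with a 20-word relevant passage in the middle.

```python
def split_into_chunks(words, chunk_size):
    """Split a list of words into consecutive chunks of at most chunk_size words."""
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]


def count_matches(chunk, query_terms):
    """Count how many words in the chunk match a query term."""
    return sum(1 for word in chunk if word in query_terms)


def best_chunk(chunks, query_terms):
    """Return the chunk with the most query-term matches (toy retrieval)."""
    return max(chunks, key=lambda chunk: count_matches(chunk, query_terms))


def relevant_fraction(chunk, query_terms):
    """Fraction of the chunk's words that match the query (a rough precision proxy)."""
    return count_matches(chunk, query_terms) / len(chunk)


# Synthetic document (assumption): 1000 filler words with one
# 20-word relevant passage at positions 400..419.
words = ["filler"] * 1000
words[400:420] = ["refund"] * 20
query_terms = {"refund"}

small_chunks = split_into_chunks(words, 200)   # moderate chunk size
large_chunks = split_into_chunks(words, 1000)  # very large chunk size

small_density = relevant_fraction(best_chunk(small_chunks, query_terms), query_terms)
large_density = relevant_fraction(best_chunk(large_chunks, query_terms), query_terms)

print(len(small_chunks), small_density)
print(len(large_chunks), large_density)
```

With the moderate chunk size, the retrieved chunk carries a higher share of relevant words because less surrounding filler is bundled with the passage. The cost is that more chunks have to be scored, which matches the "More chunks" and "Fewer chunks" entries in the table.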