Performance: Context formatting and injection
MEDIUM IMPACT
Controlling the size and structure of the context injected into the prompt directly affects how quickly the language model can generate a response: smaller, well-formatted prompts reduce token count, which shortens both time to first token and overall interaction latency.
```python
# [OK] Good: format and truncate the context to a token budget before injection
formatted_context = format_context(large_documents, max_tokens=500)
prompt = f"Answer based on context: {formatted_context}"
response = llm.generate(prompt)
```
```python
# [X] Bad: inject the full raw context with no formatting or truncation
context = "".join(large_documents)
prompt = f"Answer based on context: {context}"
response = llm.generate(prompt)
```
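The `format_context` helper above is not shown in this section; a minimal sketch of what it might do is below, assuming a whitespace-split word count as a rough stand-in for tokens (a real implementation would use the model's tokenizer):

```python
# Hypothetical sketch of format_context: concatenate documents in order,
# stopping once a rough token budget is exhausted. Whitespace-split words
# are used as a token proxy here, which is an approximation only.
def format_context(documents, max_tokens=500):
    pieces = []
    budget = max_tokens
    for doc in documents:
        words = doc.split()
        if not words:
            continue
        take = words[:budget]           # take at most the remaining budget
        pieces.append(" ".join(take))
        budget -= len(take)
        if budget <= 0:                 # budget spent: drop remaining docs
            break
    return "\n\n".join(pieces)

docs = ["alpha beta gamma", "delta epsilon"]
print(format_context(docs, max_tokens=4))  # keeps doc 1 plus one word of doc 2
```

Earlier documents are kept whole until the budget runs out, so this sketch assumes documents are already ordered by relevance; a retrieval step would typically handle that ranking.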
| Pattern | Prompt Size | Token Count | Response Latency | Verdict |
|---|---|---|---|---|
| Inject full raw context | Large (many KB) | High (1000+ tokens) | Slow (seconds of delay) | [X] Bad |
| Inject formatted, truncated context | Small (a few KB) | Low (a few hundred tokens) | Fast (sub-second) | [OK] Good |
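To catch oversized prompts like the ones in the first table row before they reach the model, a cheap pre-flight size check can help. The sketch below uses the common rule of thumb of roughly four characters per token for English text; this heuristic, and the 500-token budget, are assumptions, and an exact count would require the model's tokenizer:

```python
# Rough pre-flight prompt-size check. ~4 characters per token is a
# common heuristic for English text, not an exact count.
def estimate_tokens(text):
    return max(1, len(text) // 4)

TOKEN_BUDGET = 500  # assumed budget, matching the max_tokens example above

prompt = "Answer based on context: " + "lorem ipsum " * 200
if estimate_tokens(prompt) > TOKEN_BUDGET:
    print("Prompt exceeds token budget; truncate the context first.")
```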