Performance: Few-shot prompt templates
MEDIUM IMPACT
Few-shot examples are embedded in every prompt, so each one adds tokens the model must process before answering. More examples mean larger prompts, longer inference times, and slower-feeling interactions.
**[OK] Good: two examples keep the prompt small**

```python
from langchain.prompts import PromptTemplate, FewShotPromptTemplate

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Question: {input}\nAnswer: {output}\n",
)

examples = [
    {"input": "What is AI?", "output": "AI is artificial intelligence."},
    {"input": "Define ML.", "output": "ML is machine learning."},
]

prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Answer the following questions:\n",
    suffix="Question: {input}",
    input_variables=["input"],
)
```
**[X] Bad: four examples roughly double the prompt size**

```python
from langchain.prompts import PromptTemplate, FewShotPromptTemplate

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Question: {input}\nAnswer: {output}\n",
)

examples = [
    {"input": "What is AI?", "output": "AI is artificial intelligence."},
    {"input": "Define ML.", "output": "ML is machine learning."},
    {"input": "Explain NLP.", "output": "NLP is natural language processing."},
    {"input": "What is a neural network?",
     "output": "A neural network is a model inspired by the brain."},
]

prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Answer the following questions:\n",
    suffix="Question: {input}",
    input_variables=["input"],
)
```
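To make the size difference concrete, the two prompts above can be rendered and compared without LangChain at all. The sketch below reproduces the same prefix/example/suffix structure with plain string formatting and uses a rough 4-characters-per-token estimate (an assumption for illustration, not a real tokenizer):

```python
# Compare the rendered size of the two few-shot prompts above, using plain
# string formatting (mirroring FewShotPromptTemplate's prefix + examples +
# suffix layout) and ~4 chars/token as a crude estimate.

EXAMPLE_TEMPLATE = "Question: {input}\nAnswer: {output}\n"

def build_prompt(examples, user_input):
    """Concatenate prefix, formatted examples, and suffix."""
    prefix = "Answer the following questions:\n"
    body = "".join(EXAMPLE_TEMPLATE.format(**ex) for ex in examples)
    suffix = f"Question: {user_input}"
    return prefix + body + suffix

def approx_tokens(text):
    return len(text) // 4  # heuristic estimate, not a tokenizer

two = [
    {"input": "What is AI?", "output": "AI is artificial intelligence."},
    {"input": "Define ML.", "output": "ML is machine learning."},
]
four = two + [
    {"input": "Explain NLP.", "output": "NLP is natural language processing."},
    {"input": "What is a neural network?",
     "output": "A neural network is a model inspired by the brain."},
]

small = build_prompt(two, "What is deep learning?")
large = build_prompt(four, "What is deep learning?")
print(approx_tokens(small), approx_tokens(large))
```

Every extra example pays its token cost on every single request, which is why the gap compounds under load.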
| Pattern | Token Count | Inference Time | Latency Impact | Verdict |
|---|---|---|---|---|
| Many examples (4+) | High (~400 tokens) | Longer (~400ms) | Higher latency | [X] Bad |
| Few examples (2) | Medium (~200 tokens) | Shorter (~200ms) | Lower latency | [OK] Good |
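One way to keep example count under control is to select only as many examples as fit a token budget when the prompt is built. LangChain ships a `LengthBasedExampleSelector` for this purpose; the sketch below is a minimal, dependency-free version of the same idea, again using a 4-characters-per-token estimate (an assumption, not a real tokenizer):

```python
# Minimal length-based example selection: greedily include examples until a
# token budget is exhausted, so prompt size stays bounded even as the example
# pool grows. Uses ~4 chars/token as a crude estimate.

def approx_tokens(text: str) -> int:
    return len(text) // 4  # heuristic estimate, not a tokenizer

def select_examples(examples, max_tokens):
    """Return the longest prefix of `examples` that fits within the budget."""
    selected, used = [], 0
    for ex in examples:
        cost = approx_tokens(f"Question: {ex['input']}\nAnswer: {ex['output']}\n")
        if used + cost > max_tokens:
            break
        selected.append(ex)
        used += cost
    return selected

examples = [
    {"input": "What is AI?", "output": "AI is artificial intelligence."},
    {"input": "Define ML.", "output": "ML is machine learning."},
    {"input": "Explain NLP.", "output": "NLP is natural language processing."},
    {"input": "What is a neural network?",
     "output": "A neural network is a model inspired by the brain."},
]

# A tight budget admits only the first examples; a generous one admits all.
print(len(select_examples(examples, max_tokens=25)),
      len(select_examples(examples, max_tokens=1000)))
```

This keeps the latency cost predictable: the budget, not the size of the example pool, bounds the prompt.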