Caching saves previous results so you don't pay or wait for the same work twice. It helps reduce costs and speeds up your app.
Caching strategies for cost reduction in LangChain
Introduction
When you call an expensive API multiple times with the same input
When you want faster responses for repeated questions or tasks
When you want to limit usage of paid services to save money
When you build a chatbot that often answers the same questions
When you want to avoid unnecessary processing for unchanged data
Syntax
LangChain
from langchain.globals import set_llm_cache
from langchain.cache import InMemoryCache

# Register a process-wide LLM cache; identical calls are then served from it
set_llm_cache(InMemoryCache())
Once a cache is registered with set_llm_cache, LangChain checks it before each model call and only runs the model on a cache miss.
LangChain supports different cache backends like memory, Redis, or custom stores.
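A custom store follows the same shape as LangChain's cache backends, which look entries up by the prompt plus a string identifying the model and its parameters. The sketch below is plain Python with illustrative names (DictStore and the example prompt/model strings are not LangChain APIs):

```python
from typing import Any, Dict, Optional, Tuple

class DictStore:
    """Plain-dict sketch of a cache store keyed by (prompt, llm_string)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], Any] = {}

    def lookup(self, prompt: str, llm_string: str) -> Optional[Any]:
        # Return the cached value, or None on a miss
        return self._store.get((prompt, llm_string))

    def update(self, prompt: str, llm_string: str, value: Any) -> None:
        self._store[(prompt, llm_string)] = value

    def clear(self) -> None:
        self._store.clear()

store = DictStore()
store.update('What is AI?', 'gpt-4o|temp=0', 'AI is ...')
print(store.lookup('What is AI?', 'gpt-4o|temp=0'))  # cache hit
print(store.lookup('What is AI?', 'gpt-4o|temp=1'))  # different params: miss
```

Keying on the model string as well as the prompt matters: the same prompt sent to a different model or temperature should not share a cached answer.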
Examples
This caches the answer to 'What is AI?' so next time it returns instantly without calling the model.
LangChain
from langchain.globals import set_llm_cache
from langchain.cache import InMemoryCache

set_llm_cache(InMemoryCache())
# llm is any model object you have already configured
llm.invoke('What is AI?')  # first call hits the model and stores the answer
llm.invoke('What is AI?')  # identical call is answered from the cache
This uses Redis to cache results, which works well for multiple app instances sharing the cache.
LangChain
import redis
from langchain.globals import set_llm_cache
from langchain.cache import RedisCache

client = redis.Redis.from_url('redis://localhost:6379')
# App instances pointing at the same Redis share one cache
set_llm_cache(RedisCache(redis_=client))
Sample Program
This example shows caching an expensive calculation. The first call runs the function and caches the result. The second call returns the cached result instantly without running the function again.
LangChain
from langchain.cache import InMemoryCache

cache = InMemoryCache()

# Simulate an expensive function
def expensive_function(x):
    print('Running expensive function...')
    return x * x

def get_or_compute(cache, prompt, llm_string, compute):
    # Check the cache first; run the function only on a miss
    cached = cache.lookup(prompt, llm_string)
    if cached is not None:
        return cached
    result = compute()
    cache.update(prompt, llm_string, result)
    return result

# First call runs the function and caches the result
result1 = get_or_compute(cache, 'square', '4', lambda: expensive_function(4))
print('First call result:', result1)

# Second call is served from the cache; the function does not run again
result2 = get_or_compute(cache, 'square', '4', lambda: expensive_function(4))
print('Second call result:', result2)
Output
Running expensive function...
First call result: 16
Second call result: 16
Important Notes
Cache keys should be unique and descriptive to avoid collisions.
Cached data may become stale; consider cache expiration if needed.
Choose cache backend based on your app scale and persistence needs.
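The key-collision and staleness notes above can be sketched together in plain Python. Hashing every input that affects the answer keeps keys unique, and a time-to-live (TTL) evicts stale entries; all names here (make_key, TTLCache, the example model strings) are illustrative, not LangChain APIs:

```python
import hashlib
import json
import time
from typing import Any, Optional

def make_key(model: str, prompt: str, params: dict) -> str:
    # Hash everything that affects the answer so distinct calls never collide
    payload = json.dumps({'model': model, 'prompt': prompt, 'params': params},
                         sort_keys=True)
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()

class TTLCache:
    """Entries expire after ttl seconds, so stale answers get recomputed."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self._store: dict = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (value, time.monotonic())

cache = TTLCache(ttl=0.05)
key = make_key('gpt-4o', 'What is AI?', {'temperature': 0})
cache.set(key, 'AI is ...')
print(cache.get(key))   # fresh entry: returns the cached answer
time.sleep(0.1)
print(cache.get(key))   # past the TTL: returns None, caller recomputes
```

Production Redis deployments get the same effect natively by setting an expiry on each key, so the TTL class is only needed for in-process stores.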
Summary
Caching stores results to avoid repeated work and reduce costs.
Register a cache with set_llm_cache so repeated calls are answered from the cache instead of re-running the model.
Pick the right cache type for your app, like in-memory or Redis.