LangChain framework · ~10 mins

LangServe for API deployment in LangChain - Step-by-Step Execution

Concept Flow - LangServe for API deployment
Define LangChain Model
Create FastAPI App
Add Routes with LangServe
Start API Server
Receive API Request
Process Request with Model
Send Response Back
This flow shows how you define a model, create a FastAPI app, add LangServe routes for the model, start the server, and handle API requests.
Execution Sample
from fastapi import FastAPI
from langserve import add_routes
from langchain_openai import ChatOpenAI  # current import; older versions used langchain.chat_models
import uvicorn

# Instantiate the chat model (requires OPENAI_API_KEY in the environment)
model = ChatOpenAI()

# Create the FastAPI application
app = FastAPI()

# Register LangServe endpoints for the model under /chat (e.g. /chat/invoke)
add_routes(app, model, path="/chat")

if __name__ == "__main__":
    # Serve the app at http://localhost:8000
    uvicorn.run(app, host="localhost", port=8000)
This code sets up a ChatOpenAI model, creates a FastAPI app, adds LangServe routes, and starts the API server with uvicorn.
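Once the server above is running, a client can call the /chat/invoke endpoint. The sketch below builds the JSON payload LangServe expects (the runnable's input under an "input" key) using only the standard library; the question text and localhost URL are illustrative and assume the sample server is up.

```python
import json
import urllib.request

def build_invoke_payload(question: str) -> dict:
    # LangServe /invoke endpoints expect the runnable's input under "input"
    return {"input": question}

payload = build_invoke_payload("What is LangServe?")
body = json.dumps(payload).encode("utf-8")

# Uncomment once the server from the sample above is running:
# req = urllib.request.Request(
#     "http://localhost:8000/chat/invoke",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["output"])
```

The response body mirrors the request shape, with the model's answer under an "output" key.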
Execution Table
Step | Action | State Change | Result
1 | Import FastAPI, add_routes, ChatOpenAI, uvicorn | Modules ready | No output
2 | Create ChatOpenAI model instance | model variable set | Model ready to use
3 | Create FastAPI app instance | app variable set | App ready to add routes
4 | add_routes(app, model, path="/chat") | app now has routes | App can serve model via API
5 | Start API server with uvicorn.run(app) | Server running | API endpoint available at http://localhost:8000/chat/invoke
6 | Receive API request | Request received | Ready to process
7 | Process request using model | Model generates response | Response ready
8 | Send response back to client | Response sent | Client receives answer
9 | No more requests or server stopped | Server stops | API unavailable
💡 Server stops when manually stopped or no requests remain
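Steps 6 through 8 can be sketched as plain Python: receive a request, run the model, return the response. A stub function stands in for the real ChatOpenAI call so the sketch runs offline; the stub's echo behavior is purely illustrative.

```python
def stub_model(prompt: str) -> str:
    # Placeholder for model.invoke(prompt); a real model would call the LLM here
    return f"echo: {prompt}"

def handle_request(prompt: str) -> dict:
    # Step 6: request received. Step 7: model generates a response.
    answer = stub_model(prompt)
    # Step 8: response is packaged and returned to the client
    return {"output": answer}

# The server repeats steps 6-8 for each incoming request
responses = [handle_request(q) for q in ["hi", "bye"]]
```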
Variable Tracker
Variable | Start | After Step 2 | After Step 3 | After Step 4 | After Step 5 | Final
model | None | ChatOpenAI instance | ChatOpenAI instance | ChatOpenAI instance | ChatOpenAI instance | ChatOpenAI instance
app | None | None | FastAPI instance | FastAPI with LangServe routes | FastAPI app being served | Same app; server running or stopped
Key Moments - 3 Insights
Why do we need to call add_routes(app, model) before serving?
Because it registers the model's invoke/batch/stream methods as API endpoints on the FastAPI app, as shown in step 4 of the execution table.
What happens if we call uvicorn.run(app) without adding routes?
The server starts but exposes no endpoints for the model, so requests to /chat fail (see steps 5 and 7).
How does the API server handle multiple requests?
Each incoming request repeats steps 6 through 8: the registered route passes it to the model, the model generates a response, and the server sends it back to the client.
Visual Quiz - 3 Questions
Test your understanding
Looking at the execution table, what is the state of 'app' after step 4?
A. FastAPI with LangServe routes
B. FastAPI instance without routes
C. None
D. Server stopped
💡 Hint
Check the 'State Change' column at step 4 in the execution table
At which step does the API server start listening for requests?
A. Step 7
B. Step 3
C. Step 5
D. Step 2
💡 Hint
Look for 'Server running' in the 'State Change' column
If you skip add_routes(app, model), what will happen when a request arrives at /chat?
A. Request is processed normally
B. 404 error or no response generated
C. Server crashes immediately
D. Server refuses to start
💡 Hint
Refer to the Key Moments note about missing routes and step 7 in the execution table
Concept Snapshot
LangServe API Deployment:
1. Import FastAPI, langserve.add_routes, ChatOpenAI, and uvicorn.
2. Create model instance.
3. Create FastAPI app.
4. add_routes(app, model, path="/chat").
5. Start server with uvicorn.run(app).
6. Server handles requests by passing them to model.
7. Responses sent back to clients.
Full Transcript
This visual execution shows how to deploy an API using LangServe with LangChain models. First, import FastAPI, add_routes from langserve, ChatOpenAI, and uvicorn. Create a ChatOpenAI model instance. Create a FastAPI app. Add routes with add_routes(app, model, path="/chat"). Start the server with uvicorn.run(app). When a request comes to /chat/invoke, the server passes it to the model, which processes it and returns a response. The server sends this back to the client. This repeats for each request until stopped.