How to Use PydanticOutputParser in LangChain for Structured Output
To use PydanticOutputParser in LangChain, define a Pydantic model for your expected output, then create the parser with that model. Pass the parser to your language model chain to automatically convert raw text outputs into validated Pydantic objects.
Syntax
The PydanticOutputParser requires a Pydantic model class that defines the structure of the expected output. You create the parser by passing this model class to PydanticOutputParser. Then, use the parser's parse method to convert raw string outputs into instances of the Pydantic model.
- Pydantic model: Defines the output fields and types.
- PydanticOutputParser: Wraps the model to parse text outputs.
- parse method: Converts raw text to model instance.
python
from pydantic import BaseModel
from langchain.output_parsers import PydanticOutputParser


class MyOutputModel(BaseModel):
    name: str
    age: int


parser = PydanticOutputParser(pydantic_object=MyOutputModel)

# Example usage:
raw_output = '{"name": "Alice", "age": 30}'
parsed = parser.parse(raw_output)
print(parsed.name, parsed.age)
Output
Alice 30
Example
This example shows how to define a Pydantic model for a person's name and age, create a PydanticOutputParser with it, and parse a JSON string output from a language model into a typed Python object.
python
from pydantic import BaseModel
from langchain.output_parsers import PydanticOutputParser


class Person(BaseModel):
    name: str
    age: int


# Create the parser with the Pydantic model
parser = PydanticOutputParser(pydantic_object=Person)

# Simulated raw output from a language model
raw_output = '{"name": "Bob", "age": 25}'

# Parse the raw output into a Person instance
person = parser.parse(raw_output)
print(f"Name: {person.name}")
print(f"Age: {person.age}")
Output
Name: Bob
Age: 25
Common Pitfalls
Common mistakes when using PydanticOutputParser include:
- Not matching the output format exactly to the Pydantic model fields and types.
- Passing raw text that is not valid JSON or does not conform to the model schema.
- Forgetting to handle parsing exceptions when the output is invalid.
Always ensure the language model outputs JSON matching your model, and catch errors when parsing.
python
from pydantic import BaseModel
from langchain.output_parsers import PydanticOutputParser
from langchain_core.exceptions import OutputParserException


class Data(BaseModel):
    count: int


parser = PydanticOutputParser(pydantic_object=Data)

# Wrong output (missing quotes around keys), so it is not valid JSON
bad_output = '{count: 10}'
try:
    parser.parse(bad_output)
except OutputParserException as e:
    # parse() wraps JSON and validation failures in OutputParserException
    print(f"Parsing failed: {type(e).__name__}")

# Correct output
good_output = '{"count": 10}'
parsed = parser.parse(good_output)
print(parsed.count)
Output
Parsing failed: OutputParserException
10
Quick Reference
- Define a Pydantic model with expected output fields.
- Create PydanticOutputParser with the model.
- Use the parse method to convert a raw string to a model instance.
- Ensure the output format matches the model schema (usually JSON).
- Handle exceptions for invalid outputs.
Key Takeaways
Define a Pydantic model to specify the output structure before using PydanticOutputParser.
Pass the Pydantic model class to PydanticOutputParser to create a parser instance.
Use the parser's parse method to convert raw text output into validated Pydantic objects.
Ensure the language model outputs JSON matching your Pydantic model fields and types.
Handle parsing errors gracefully to avoid crashes on invalid outputs.