
How to Use PydanticOutputParser in LangChain for Structured Output

Use PydanticOutputParser in LangChain by defining a Pydantic model for your expected output, then creating the parser with that model. Add the parser to your language model chain to convert raw text outputs into validated Pydantic objects automatically.

📐

Syntax

PydanticOutputParser requires a Pydantic model class that defines the structure of the expected output. You create the parser by passing this model class to PydanticOutputParser through the pydantic_object argument. Then, use the parser's parse method to convert a raw string output into an instance of the Pydantic model.

Pydantic model: Defines the output fields and types.
PydanticOutputParser: Wraps the model to parse text outputs.
parse method: Converts raw text to model instance.

python

from pydantic import BaseModel
from langchain.output_parsers import PydanticOutputParser

class MyOutputModel(BaseModel):
    name: str
    age: int

parser = PydanticOutputParser(pydantic_object=MyOutputModel)

# Example usage:
raw_output = '{"name": "Alice", "age": 30}'
parsed = parser.parse(raw_output)
print(parsed.name, parsed.age)

Output

Alice 30
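
To make the parse step less of a black box, here is a minimal standard-library sketch of the same idea: decode the JSON text, then check each field against its declared type. This is not LangChain's or Pydantic's implementation. The `MyOutput` dataclass and the `parse_output` function are hypothetical stand-ins, and the sketch skips the type coercion and detailed error reports that Pydantic provides.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class MyOutput:
    # Hypothetical stand-in for the Pydantic model above
    name: str
    age: int


def parse_output(raw: str) -> MyOutput:
    """Decode JSON text, then check each field against its declared type."""
    data = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    values = {}
    for field in fields(MyOutput):
        if field.name not in data:
            raise ValueError(f"missing field: {field.name}")
        value = data[field.name]
        if not isinstance(value, field.type):
            raise ValueError(f"{field.name} must be {field.type.__name__}")
        values[field.name] = value
    return MyOutput(**values)


parsed = parse_output('{"name": "Alice", "age": 30}')
print(parsed.name, parsed.age)
```

The real parser adds value on top of this pattern: the schema lives in one model class, and validation failures are reported consistently.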

💻

Example

This example shows how to define a Pydantic model for a person's name and age, create a PydanticOutputParser with it, and parse a JSON string output from a language model into a typed Python object.

python

from pydantic import BaseModel
from langchain.output_parsers import PydanticOutputParser

class Person(BaseModel):
    name: str
    age: int

# Create the parser with the Pydantic model
parser = PydanticOutputParser(pydantic_object=Person)

# Simulated raw output from a language model
raw_output = '{"name": "Bob", "age": 25}'

# Parse the raw output into a Person instance
person = parser.parse(raw_output)

print(f"Name: {person.name}")
print(f"Age: {person.age}")

Output

Name: Bob
Age: 25
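
Parsing only succeeds if the model was asked to answer in the expected shape. PydanticOutputParser offers a get_format_instructions() method for this purpose. The sketch below is only a rough standard-library illustration of the idea, not that method's actual output. The `Person` dataclass and the `format_instructions` helper are hypothetical names introduced for this example.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Person:
    # Hypothetical stand-in mirroring the Person model above
    name: str
    age: int


def format_instructions(model) -> str:
    """Describe the expected JSON object so it can be appended to a prompt."""
    shape = {f.name: f.type.__name__ for f in fields(model)}
    return (
        "Respond with a single JSON object and nothing else. "
        "Use exactly these keys and value types: " + json.dumps(shape)
    )


# Assumed example prompt; the wording is illustrative only
prompt = "Extract the person mentioned in: 'Bob is 25.'\n" + format_instructions(Person)
print(prompt)
```

Deriving the instructions from the same class that validates the response keeps the prompt and the schema from drifting apart.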

⚠️

Common Pitfalls

Common mistakes when using PydanticOutputParser include:

Not matching the output format exactly to the Pydantic model fields and types.
Passing raw text that is not valid JSON or does not conform to the model schema.
Forgetting to handle parsing exceptions when the output is invalid.

Always ensure the language model outputs JSON matching your model, and catch errors when parsing.

python

from pydantic import BaseModel
from langchain.output_parsers import PydanticOutputParser
# Assumes a LangChain version that ships the langchain_core package
from langchain_core.exceptions import OutputParserException

class Data(BaseModel):
    count: int

parser = PydanticOutputParser(pydantic_object=Data)

# Wrong output (missing quotes around the key, so it is not valid JSON)
bad_output = '{count: 10}'

try:
    parser.parse(bad_output)
except OutputParserException as e:
    # parse() wraps JSON decoding and Pydantic validation errors in OutputParserException
    print(f"Parsing failed: {e}")

# Correct output
good_output = '{"count": 10}'
parsed = parser.parse(good_output)
print(parsed.count)

Output

Parsing failed: <error message; the exact wording depends on the LangChain version>
10
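
The three pitfalls listed above fail in different ways, and a small wrapper can turn each of them into a value the caller can inspect instead of an unhandled exception. The sketch below uses only the standard library. `parse_count` is a hypothetical stand-in for parser.parse, and `try_parse` is an assumed helper name, not part of LangChain.

```python
import json


def parse_count(raw: str) -> int:
    """Hypothetical stand-in for parser.parse(): returns the 'count' field."""
    data = json.loads(raw)  # invalid JSON raises json.JSONDecodeError (a ValueError)
    if not isinstance(data, dict) or "count" not in data:
        raise ValueError("missing field: count")
    if not isinstance(data["count"], int):
        raise ValueError("count must be an integer")
    return data["count"]


def try_parse(raw: str):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return parse_count(raw), None
    except ValueError as exc:
        return None, str(exc)


samples = [
    '{count: 10}',       # not valid JSON: the key is unquoted
    '{"total": 10}',     # valid JSON, but the expected field is missing
    '{"count": "ten"}',  # field present, wrong type
    '{"count": 10}',     # matches the schema
]
results = [try_parse(raw) for raw in samples]
for raw, (value, error) in zip(samples, results):
    status = f"ok: {value}" if error is None else f"failed: {error}"
    print(f"{raw!r} -> {status}")
```

With the real parser, the same wrapper shape applies: call parser.parse inside the try block and catch the exception it raises.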

📊

Quick Reference

Define a Pydantic model with expected output fields.
Create PydanticOutputParser with the model.
Use parse method to convert raw string to model instance.
Ensure output format matches model schema (usually JSON).
Handle exceptions for invalid outputs.

✅

Key Takeaways

Define a Pydantic model to specify the output structure before using PydanticOutputParser.

Pass the Pydantic model class to PydanticOutputParser to create a parser instance.

Use the parser's parse method to convert raw text output into validated Pydantic objects.

Ensure the language model outputs JSON matching your Pydantic model fields and types.

Handle parsing errors gracefully to avoid crashes on invalid outputs.