How to Parse Structured Output in Langchain: Simple Guide
In Langchain, you parse structured output with output parser classes such as StructuredOutputParser or RegexParser, which convert raw model responses into usable Python objects. These parsers extract data cleanly by defining the expected format or pattern.
Syntax
To parse structured output in Langchain, you typically use an output parser instance. The main parts are:
- BaseOutputParser: Base class that all output parsers extend.
- parse() method: Takes the raw string output and returns structured data.
- Specific parsers like StructuredOutputParser or RegexParser define how to extract data.
python
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

# Create a parser with a defined schema (one ResponseSchema per expected field)
parser = StructuredOutputParser.from_response_schemas([
    ResponseSchema(name="name", description="Person's name"),
    ResponseSchema(name="age", description="Person's age", type="integer"),
])

# Use parser.parse() to convert raw output into a dict
raw_output = '{"name": "Alice", "age": 30}'
parsed = parser.parse(raw_output)
print(parsed)
Output
{'name': 'Alice', 'age': 30}
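RegexParser is mentioned above but not shown. The sketch below illustrates the pattern-based idea using only Python's `re` module. The `regex_parse` helper and its parameter names are hypothetical stand-ins, not Langchain's implementation: it maps the capture groups of a pattern onto a list of output keys.

```python
import re
from typing import Dict, List


def regex_parse(text: str, pattern: str, output_keys: List[str]) -> Dict[str, str]:
    """Map the capture groups of `pattern` onto `output_keys` (hypothetical helper)."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"Could not parse output: {text!r}")
    groups = match.groups()
    if len(groups) != len(output_keys):
        raise ValueError("Pattern must define one capture group per output key")
    return dict(zip(output_keys, groups))


# Simulated free-text model output
raw_output = "Name: Alice, Age: 30"
parsed = regex_parse(raw_output, r"Name:\s*(\w+),\s*Age:\s*(\d+)", ["name", "age"])
print(parsed)
```

This prints {'name': 'Alice', 'age': '30'}. Note that every captured value is a string, so numeric fields need an explicit conversion afterwards.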
Example
This example shows how to use StructuredOutputParser to parse a JSON-like string output from a language model into a Python dictionary with specific fields.
python
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

# Define the expected output schema
schemas = [
    ResponseSchema(name="city", description="Name of the city"),
    ResponseSchema(name="temperature", description="Current temperature in Celsius", type="integer"),
]

# Create the parser
parser = StructuredOutputParser.from_response_schemas(schemas)

# Simulated raw output from a language model
raw_output = '{"city": "Paris", "temperature": 18}'

# Parse the output into a Python dictionary
parsed_output = parser.parse(raw_output)
print(parsed_output)
Output
{'city': 'Paris', 'temperature': 18}
Common Pitfalls
Common mistakes when parsing structured output in Langchain include:
- Not matching the output format exactly, causing parse errors.
- Using raw string output without cleaning or validating it first.
- Ignoring exceptions raised by parse() when output is malformed.
- Confusing the parser's expected schema with the actual output format.
Always ensure your language model outputs match the parser's expected structure.
python
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

schemas = [ResponseSchema(name="name", description="Person's name")]
parser = StructuredOutputParser.from_response_schemas(schemas)

# Wrong output format (key and value are unquoted, so this is not valid JSON)
raw_output_wrong = '{name: Alice}'
try:
    parser.parse(raw_output_wrong)
except Exception as e:
    print(f"Error parsing output: {e}")

# Correct output format
raw_output_right = '{"name": "Alice"}'
parsed = parser.parse(raw_output_right)
print(parsed)
Output
Error parsing output: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
{'name': 'Alice'}
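The pitfall about cleaning and validating raw output can be sketched with the standard library alone. The helpers `extract_json` and `parse_checked` below are hypothetical names, not part of Langchain. The assumption is that the model may wrap its JSON in a markdown code fence or surround it with extra text.

```python
import json
import re
from typing import Any, Dict, List

FENCE = "`" * 3  # a markdown code fence, built from single backticks


def extract_json(text: str) -> str:
    """Return the JSON payload, dropping an optional markdown fence or surrounding text."""
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:
        return fenced.group(1).strip()
    # No fence found: fall back to the outermost {...} span
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("No JSON object found in model output")
    return text[start : end + 1]


def parse_checked(text: str, expected_keys: List[str]) -> Dict[str, Any]:
    """Parse the cleaned text and verify that every expected key is present."""
    data = json.loads(extract_json(text))
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    missing = [key for key in expected_keys if key not in data]
    if missing:
        raise ValueError(f"Missing keys: {missing}")
    return data


# Simulated model output with chatter and a fenced JSON block
raw_output = (
    "Sure! Here is the result:\n"
    + FENCE + "json\n"
    + '{"name": "Alice", "age": 30}\n'
    + FENCE
)
print(parse_checked(raw_output, ["name", "age"]))
```

Because `json.JSONDecodeError` is a subclass of `ValueError`, a caller can catch `ValueError` to cover both malformed JSON and missing keys.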
Quick Reference
Tips for parsing structured output in Langchain:
- Use StructuredOutputParser.from_response_schemas() to define expected fields.
- Always validate or clean raw output before parsing.
- Handle exceptions from parse() to catch malformed outputs.
- Test your parser with sample outputs to ensure compatibility.
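The last two tips can be combined into a small check that runs a parser over sample outputs without crashing. This is only a sketch: `safe_parse` is a hypothetical helper, and `json.loads` stands in for the parser so the snippet runs without Langchain installed. In practice you would pass `parser.parse` instead.

```python
import json
from typing import Any, Callable, Dict, Optional, Tuple

Parser = Callable[[str], Dict[str, Any]]


def safe_parse(parse: Parser, text: str) -> Tuple[Optional[Dict[str, Any]], Optional[str]]:
    """Return (result, None) on success or (None, error message) on failure."""
    try:
        return parse(text), None
    except Exception as exc:  # broad on purpose: different parsers raise different errors
        return None, f"{type(exc).__name__}: {exc}"


# Sample outputs covering the well-formed case and two typical failures
samples = [
    '{"city": "Paris", "temperature": 18}',    # well-formed JSON
    '{city: Paris}',                           # unquoted key and value
    'The temperature in Paris is 18 degrees',  # no JSON at all
]

# json.loads is a stand-in here; swap in parser.parse for a real parser
results = [safe_parse(json.loads, sample) for sample in samples]
for sample, (parsed, error) in zip(samples, results):
    status = "ok" if error is None else f"failed ({error})"
    print(f"{sample!r}: {status}")
```

Returning the error message instead of raising keeps the loop going, so one malformed sample does not hide the results for the others.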
Key Takeaways
Use Langchain's OutputParser classes to convert raw model output into structured data.
Define clear schemas matching expected output fields for reliable parsing.
Always handle parsing errors to avoid crashes from unexpected output formats.
Test parsers with sample outputs to ensure they work as intended.