Get structured output from LLMs with Instructor
Instructor is one of the most popular libraries for structured LLM output. It patches the OpenAI and Anthropic SDK clients to add automatic validation, retries, and streaming, returning responses as Pydantic models (Python) or Zod schemas (TypeScript, via instructor-js).
Prerequisites
- Python 3.10+ or Node 20+
- An OpenAI or Anthropic API key
- Familiarity with Pydantic (Python) or Zod (TypeScript)
Step-by-Step
1. Install Instructor
Instructor works as a wrapper around the official SDKs.
```bash
pip install instructor openai
```
2. Patch the client
Use instructor.from_openai() to add structured output capabilities to your client.
```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Wrap the OpenAI client; the patched client accepts a response_model argument
client = instructor.from_openai(OpenAI())
```
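The Anthropic SDK follows the same pattern via instructor.from_anthropic(). A minimal sketch, assuming the anthropic package is installed and an ANTHROPIC_API_KEY is set:

```python
import instructor
from anthropic import Anthropic

# Patch the Anthropic client; its messages.create() then accepts
# a response_model parameter (the SDK also requires max_tokens).
anthropic_client = instructor.from_anthropic(Anthropic())
```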
3. Define your response model
Pydantic models define the output structure. The docstring and field descriptions are passed along to the LLM, so they act as extraction guidance.
```python
from pydantic import BaseModel, Field


class UserInfo(BaseModel):
    """Extracted user information from text."""

    name: str
    age: int
    email: str | None = None
    interests: list[str] = Field(description='Hobbies and interests mentioned in the text')
```
4. Extract structured data
Pass the response_model parameter; Instructor handles schema conversion, validation, and retries.
```python
user = client.chat.completions.create(
    model='gpt-4o-mini',
    response_model=UserInfo,
    messages=[
        {'role': 'user', 'content': 'John is 25 years old, loves hiking and photography'}
    ],
)
# user is a validated UserInfo instance, not raw JSON
print(user.name, user.age, user.interests)
```
5. Add automatic retries
When validation fails, Instructor re-sends the request with the validation errors included. Set max_retries to control the number of attempts.
```python
text = 'John is 25 years old, loves hiking and photography'

user = client.chat.completions.create(
    model='gpt-4o-mini',
    response_model=UserInfo,
    max_retries=3,  # re-ask up to 3 times if validation fails
    messages=[{'role': 'user', 'content': text}],
)
```
6. Stream partial objects
Get structured data as it streams. Great for real-time UIs.
```python
from instructor import Partial

for partial in client.chat.completions.create(
    model='gpt-4o-mini',
    response_model=Partial[UserInfo],
    stream=True,
    messages=[{'role': 'user', 'content': text}],
):
    # Fields that have not streamed in yet are None
    print(partial.model_dump())
```
Common Pitfalls
- Using response_model without patching the client first.
- Missing docstrings on models: they are passed to the LLM as part of the schema description.
- Setting max_retries too high burns tokens on fundamentally malformed inputs.
- Forgetting that Partial streaming yields incomplete objects until the stream ends.
What's Next
- Add custom validators for business logic (see the first sketch below).
- Try instructor-js for TypeScript with the same API.
- Combine with async clients for concurrent extractions (see the second sketch below).
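For the custom validators, Pydantic field validators plug straight into Instructor's retry loop: a raised ValueError is fed back to the LLM on the next attempt. A minimal sketch, assuming the patched client from step 2; the ValidatedUser model and the age rule are illustrative:

```python
from pydantic import BaseModel, field_validator


class ValidatedUser(BaseModel):
    """User info with a business rule on age."""

    name: str
    age: int

    @field_validator('age')
    @classmethod
    def age_must_be_adult(cls, value: int) -> int:
        # On failure, Instructor includes this message in the retry prompt
        if value < 18:
            raise ValueError('age must be 18 or older')
        return value


user = client.chat.completions.create(
    model='gpt-4o-mini',
    response_model=ValidatedUser,
    max_retries=2,
    messages=[{'role': 'user', 'content': 'Dana is a 34-year-old engineer'}],
)
```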
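For the concurrent extractions, Instructor also patches the async OpenAI client. A minimal sketch using asyncio.gather, assuming the UserInfo model from step 3; the sample texts are illustrative:

```python
import asyncio

import instructor
from openai import AsyncOpenAI

# The async client is patched exactly like the sync one
aclient = instructor.from_openai(AsyncOpenAI())


async def extract(text: str) -> UserInfo:
    return await aclient.chat.completions.create(
        model='gpt-4o-mini',
        response_model=UserInfo,
        messages=[{'role': 'user', 'content': text}],
    )


async def main() -> None:
    texts = [
        'John is 25 years old, loves hiking and photography',
        'Mia, 31, is into climbing and chess',
    ]
    # Run all extractions concurrently instead of one at a time
    users = await asyncio.gather(*(extract(t) for t in texts))
    for user in users:
        print(user.name, user.interests)


asyncio.run(main())
```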
