Creating Type-Safe LLM Agents with Pydantic AI: A Guide to Reliable Structured Outputs

Introduction

Large language models (LLMs) are incredibly powerful, but their free-form text responses can be a headache to integrate into deterministic code. Enter Pydantic AI, a Python framework that bridges the gap by letting you build LLM agents that return validated, structured outputs using Pydantic models. Instead of wrestling with raw strings, you get type-safe objects with automatic validation—no more guesswork or fragile parsing.

Source: realpython.com

If you’ve used FastAPI or Pydantic before, the pattern will feel refreshingly familiar: define your data schemas with Python type hints, and let the framework handle the rest. In this guide, we’ll explore how Pydantic AI turns LLM responses into reliable, validated data, covering everything from basic schema definitions to advanced features like dependency injection and automatic retries.

Core Concepts of Pydantic AI

Defining Structured Outputs with BaseModel

At the heart of Pydantic AI is the BaseModel class from Pydantic. You define your desired output as a BaseModel subclass with type-annotated fields. This schema tells the LLM exactly what structure you expect—and thanks to Pydantic’s built-in validation, any mismatched or missing data triggers an automatic retry (more on that later).

from pydantic import BaseModel

class WeatherForecast(BaseModel):
    city: str
    temperature: float
    unit: str = "Celsius"

This approach guarantees that every response from your agent adheres to the schema. No more guessing if the LLM decided to format the temperature as a string or omit the unit. The result is a clean, validated Python object you can use directly in your application.
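To see what the schema buys you before any LLM is involved, here is a minimal sketch using Pydantic alone: a well-formed response parses into a typed object, while a malformed one raises a `ValidationError`, which is exactly the signal Pydantic AI uses to trigger a retry.

```python
from pydantic import BaseModel, ValidationError

class WeatherForecast(BaseModel):
    city: str
    temperature: float
    unit: str = "Celsius"

# A conforming response parses straight into a typed object.
good = WeatherForecast.model_validate_json(
    '{"city": "Oslo", "temperature": 3.5}'
)
print(good.temperature + 1)  # a real float you can do math on: 4.5

# A non-conforming response (temperature as prose) fails validation.
try:
    WeatherForecast.model_validate_json(
        '{"city": "Oslo", "temperature": "pretty cold"}'
    )
except ValidationError as exc:
    print(exc.error_count(), "validation error")
```

The same `ValidationError` machinery that protects your code here is what the framework feeds back to the model when a retry is needed.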

Using the @agent.tool Decorator for Function Calling

Agents in Pydantic AI can call functions based on user queries. The @agent.tool decorator registers a Python function that the LLM can invoke. The function’s docstring and type hints act as instructions to the model, telling it when and how to call the tool.

from pydantic_ai import Agent, RunContext

agent = Agent('openai:gpt-4')

@agent.tool
async def get_weather(ctx: RunContext[None], city: str) -> str:
    """Get the current weather for a given city."""
    # Replace with a real weather API call.
    return f"The weather in {city} is sunny, 22°C."

When a user asks about the weather, the LLM automatically decides to call get_weather, passing the extracted city name. The tool’s return value is then incorporated into the final response. This pattern keeps your logic modular and type-safe.
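To build intuition for how the docstring and type hints reach the model, here is a framework-free sketch of the idea using only the standard library. This is not Pydantic AI's actual internals, and `describe_tool` is an illustrative helper, but it shows the kind of schema a decorator like `@agent.tool` can derive from a plain function:

```python
import inspect
from typing import get_type_hints

async def get_weather(ctx, city: str) -> str:
    """Get the current weather for a given city."""
    return f"The weather in {city} is sunny, 22°C."

def describe_tool(func):
    """Build a rough tool description from a function's docstring and
    type hints, similar in spirit to what @agent.tool exposes to the LLM."""
    hints = get_type_hints(func)
    params = {
        name: hints.get(name, object).__name__
        for name in inspect.signature(func).parameters
        if name != "ctx"  # the context argument is framework-internal
    }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": params,
    }

print(describe_tool(get_weather))
# {'name': 'get_weather',
#  'description': 'Get the current weather for a given city.',
#  'parameters': {'city': 'str'}}
```

The better your docstrings and annotations, the better the model's decisions about when and how to call your tools.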

Advanced Features for Robust Agents

Type-Safe Dependency Injection

Real-world agents often need access to external resources like databases, APIs, or configuration. Pydantic AI supports dependency injection via the deps_type parameter. You define a dependency type (again, a Pydantic model) and inject it into your tool functions without relying on global state.

from pydantic import BaseModel
from pydantic_ai import Agent, RunContext

class MyDeps(BaseModel):
    db_connection: str  # simplified; a real app might hold a client or pool

agent = Agent('openai:gpt-4', deps_type=MyDeps)

@agent.tool
async def query_data(ctx: RunContext[MyDeps], query: str) -> str:
    """Run a database query."""
    conn = ctx.deps.db_connection
    # Use the connection to run the query.
    return "result"

This keeps your code clean, testable, and type-safe. The ctx.deps attribute gives you access to the dependency object, and Pydantic validates its structure before the agent runs.
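The payoff shows up in testing: because tools receive dependencies through the context instead of globals, you can swap in fakes. Here is a framework-free sketch of that pattern; `SimpleContext`, `RealDeps`, and `FakeDeps` are illustrative names, not Pydantic AI APIs:

```python
from dataclasses import dataclass

@dataclass
class RealDeps:
    db_connection: str

@dataclass
class FakeDeps:
    db_connection: str = "sqlite://:memory:"

@dataclass
class SimpleContext:
    """Stand-in for the ctx object a framework passes to a tool."""
    deps: object

def query_data(ctx: SimpleContext, query: str) -> str:
    # The tool never touches global state -- only ctx.deps.
    return f"ran {query!r} on {ctx.deps.db_connection}"

# Production wiring and test wiring differ only in the deps object.
prod = query_data(SimpleContext(deps=RealDeps("postgres://prod")), "SELECT 1")
test = query_data(SimpleContext(deps=FakeDeps()), "SELECT 1")
print(test)  # ran 'SELECT 1' on sqlite://:memory:
```

The same swap works with Pydantic AI's real context: point `deps` at a test double and your tools run unchanged.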


Automatic Retry on Validation Failure

LLMs are probabilistic—sometimes they produce outputs that don’t match your schema. Pydantic AI handles this gracefully with validation retries. When the model returns data that fails Pydantic’s validation, the framework automatically re-queries the LLM with instructions to fix the error. This dramatically increases reliability, though it does increase API costs.

You can configure the maximum number of retries or disable them entirely. For critical applications, validation retries are a lifesaver, ensuring that your agent never returns malformed data.
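The retry loop itself is simple to picture. Here is a plain-Python sketch of the mechanism, with a stand-in `flaky_model` that returns bad data on its first attempt; this is an illustration of the pattern, not Pydantic AI's implementation:

```python
import json

MAX_RETRIES = 2  # illustrative cap; Pydantic AI lets you configure retries

def flaky_model(prompt: str, attempt: int) -> str:
    """Stand-in for an LLM: bad JSON first, then a valid payload."""
    if attempt == 0:
        return '{"city": "Oslo", "temperature": "cold"}'  # wrong type
    return '{"city": "Oslo", "temperature": 3.5}'

def validate(payload: str) -> dict:
    data = json.loads(payload)
    if not isinstance(data["temperature"], (int, float)):
        raise ValueError("temperature must be a number")
    return data

def run_with_retries(prompt: str) -> dict:
    last_error = None
    for attempt in range(MAX_RETRIES + 1):
        try:
            return validate(flaky_model(prompt, attempt))
        except ValueError as exc:
            # In Pydantic AI, the validation error is fed back to the model
            # so it can correct its output on the next attempt.
            last_error = exc
    raise RuntimeError(f"still invalid after {MAX_RETRIES} retries: {last_error}")

print(run_with_retries("Forecast for Oslo?"))  # succeeds on the second attempt
```

Each retry is another API call, which is why the retry cap matters for cost control.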

Provider Support and Limitations

Not all LLM providers are equally good at producing structured outputs. Pydantic AI works best with Google Gemini, OpenAI (GPT-4, GPT-3.5), and Anthropic (Claude) models, which natively support function calling and consistent JSON generation. Other providers may still work but often require more careful prompt engineering or produce less reliable results.

When choosing a provider, consider both its structured output capabilities and the validation retry mechanism. A provider that frequently returns invalid data will increase your API costs as retries pile up.

Conclusion and Next Steps

Pydantic AI brings the reliability of type validation to the messy world of LLM outputs. By combining BaseModel schemas, @agent.tool decorators, and dependency injection, you can build agents that are both powerful and predictable. The automatic retry system further ensures your agent behaves as expected, even when the LLM stumbles.

To get started, install pydantic-ai and experiment with the examples in the official documentation. As you build more complex agents, you’ll appreciate how Pydantic AI keeps your code clean and your data safe.

Happy coding!
