Multi-Agent Patterns: A Practical Guide

05 Mar 2025

Introduction
The Domain Challenge
Pattern 1: Single Agent Workflows
Pattern 2: Agent Delegation
Pattern 3: Programmatic Agent Hand-off
Pattern 4: Graph-based Control Flow
Choosing the Right Pattern
Best Practices
Conclusion
Further Reading

Introduction

AI applications are shifting from monolithic large language models to modular, multi-agent systems—a transformation that enhances performance, flexibility, and maintainability.

In my talk about AI Agents last September, I said AI agents would become increasingly popular, and today, we see this shift happening across industries. By breaking down complex tasks into specialized components, engineers can design smarter, more scalable AI workflows.

In this guide, we'll explore four key multi-agent patterns, using travel booking as an example. You'll learn how to choose the right pattern for your application, implementation strategies, and error-handling techniques to build robust multi-agent AI systems.

The Domain Challenge

While our examples focus on travel applications, the challenges addressed here are common across many industries:

Complex User Interactions

Interpreting natural language preferences
Processing dates, locations and multi-step workflows

Sequential Processing Requirements

Searching and filtering (e.g. flight searches)
Selecting seats, ancillary services, payment processing, and final confirmation

External System Integration

Interacting with airline reservation systems, payment gateways, hotel PMS, etc.

Domain Expertise Requirements

Understanding fare rules, pricing optimisation, route planning, and travel regulations

Below is a mermaid diagram that visualises these challenges:

Pattern 1: Single Agent Workflows

Implementation Details

The Single Agent Workflow pattern is ideal for focused, well-defined tasks. Here's a simplified example of a flight search agent:

from pydantic import BaseModel
from typing import List, Optional
from datetime import date

class FlightDetails(BaseModel):
    flight_number: str
    price: float
    origin: str
    destination: str
    departure_date: date
    arrival_date: date
    duration_hours: float

class FlightSearchResult(BaseModel):
    found_flights: List[FlightDetails]
    best_flight: Optional[FlightDetails] = None
    explanation: str

class FlightSearchAgent:
    def __init__(self, model: str):
        self.system_prompt = """You are a flight search expert. Your job is to:
        1. Search available flights using the search_flights tool
        2. Analyze the results to find the best option based on price and duration
        3. Provide a clear explanation of why you chose that flight
        """
        # ... initialization code ...

    async def search_flights(self, origin: str, destination: str, date: date) -> List[FlightDetails]:
        # Implementation to search flights
        pass

When to Use Single Agent Workflows

Single Agent Workflows are most effective when:

Task Boundaries are Clear:
- E.g. flight searches with specific criteria, seat selection, or baggage allowance calculations.
Minimal State Management is Required:
- Ideal for one-off, stateless queries and independent transactions.
Performance is Critical:
- Such as in real-time pricing updates or immediate availability checks.

Implementation Considerations

Aspect	Consideration	Example
Input Validation	Ensure all required parameters are provided	Check date formats, airport codes
Error Handling	Comprehensive error catching is essential	Handling API timeouts, invalid responses
Response Format	Structure responses for easy consumption	Use Pydantic models for type safety
Monitoring	Track performance and success rates	Log response times and error rates

Code Example: Error Handling

Robust error handling is crucial. The example below demonstrates how to implement retries with exponential backoff:

class FlightSearchError(Exception):
    """Base exception for flight search errors"""
    pass

class InvalidSearchCriteriaError(FlightSearchError):
    """Raised when search criteria is invalid"""
    pass

async def search_with_retries(self, criteria: FlightSearchCriteria, max_retries: int = 3) -> List[FlightSearchResult]:
 retry_count = 0
    while retry_count < max_retries:
        try:
            return await self._perform_search(criteria)
        except FlightSearchError as e:
 retry_count += 1
            if retry_count == max_retries:
                raise
            await asyncio.sleep(1 * retry_count)  # Exponential backoff

A corresponding sequence diagram visualises the process:

Pattern 2: Agent Delegation

Implementation Architecture

The Agent Delegation pattern uses a controller agent to coordinate specialized delegate agents. Here's a simplified example:

from typing import List, Optional
from datetime import date
from pydantic import BaseModel

class TravelPlan(BaseModel):
    outbound_flight: FlightDetails
    return_flight: Optional[FlightDetails]
    hotel_recommendations: List[str]
    activity_suggestions: List[str]
    total_budget: float

# Flight search agent (delegate)
flight_search_agent = Agent(
    model="gpt-4",
    result_type=List[FlightDetails],
    system_prompt="You are a flight search specialist. Find the best flights matching the given criteria."
)

# Travel planner agent (controller)
travel_planner_agent = Agent(
    model="gpt-4",
    result_type=TravelPlan,
    system_prompt="""You are a travel planning expert. Your job is to:
    1. Use the flight_search tool to find suitable flights
    2. Suggest hotels and activities at the destination
    3. Calculate total budget including flights and estimated other expenses
    """
)

Benefits and Trade-offs

Aspect	Benefits	Trade-offs
Modularity	Clear division of responsibilities	Increased system complexity
Scalability	Easy to add specialist agents	More points of failure
Maintenance	Agents can be updated independently	Requires robust error handling
Performance	Parallel processing is possible	Additional coordination overhead

Error Handling Strategies

Coordinated error handling is key. For example:

class TravelPlanningError(Exception):
    """Base exception for travel planning errors"""
    pass

class AgentCoordinationError(TravelPlanningError):
    """Raised when agent coordination fails"""
    pass

class TravelPlannerAgent:
    async def create_travel_plan_with_fallback(self, origin: str, destination: str, dates: Dict[str, datetime], preferences: Dict[str, any]) -> TravelPlan:
        try:
            return await self.create_travel_plan(origin, destination, dates, preferences)
        except FlightSearchError:  # Fallback to alternative dates or routes
            return await self._handle_flight_search_failure(origin, destination, dates, preferences)
        except HotelSearchError:  # Fallback to alternative accommodations
            return await self._handle_hotel_search_failure(destination, dates, preferences)
        except ActivityPlanningError:  # Create a plan without activities
            return await self._create_basic_travel_plan(origin, destination, dates, preferences)

Implementation Considerations

Agent Communication: Use standardised message formats and implement retry mechanisms.
State Management: Track the progress of each sub-agent and maintain booking consistency.
Monitoring and Logging: Monitor performance metrics and log coordination events.
Testing: Unit test individual agents and integration test their interactions.

Pattern 3: Programmatic Agent Hand-off

Implementation Details

This pattern manages explicit transitions between specialized agents:

# Flight search agent
flight_search_agent = Agent(
    model="gpt-4",
    result_type=List[FlightDetails],
    system_prompt="Find available flights matching the search criteria."
)

# Seat selection agent
seat_selection_agent = Agent(
    model="gpt-4",
    result_type=SeatPreference,
    system_prompt="""Help users select their seat based on their preferences.
    - Rows 1, 14, and 20 have extra legroom
    - Seats A and F are window seats
    """
)

# Payment processing agent
payment_agent = Agent(
    model="gpt-4",
    result_type=PaymentResult,
    system_prompt="Process payment and generate booking confirmation."
)

async def process_booking():
    # Step 1: Search for flights
    flight = await search_flights(usage=usage, origin="SFO", destination="JFK", travel_date=date(2024, 5, 1))

    # Step 2: Select seat
    seat = await select_seat(usage=usage, preferences="window seat with extra legroom")

    # Step 3: Process payment
    payment_result = await process_payment(usage=usage, flight=flight, seat=seat, payment_info="Credit card ending in 1234")

Error Handling Strategies

Key aspects include state-based recovery, graceful degradation, and robust monitoring. Below is a mermaid state diagram illustrating possible transitions:

stateDiagram-v2
 [*] --> INITIATED
 INITIATED --> FLIGHTS_SELECTED: Flight search success
 INITIATED --> FAILED: Search error
 FLIGHTS_SELECTED --> SEATS_SELECTED: Seat selection success
 FLIGHTS_SELECTED --> FAILED: Selection error
 SEATS_SELECTED --> PAYMENT_PROCESSING: Payment initiated
 PAYMENT_PROCESSING --> COMPLETED: Payment success
 PAYMENT_PROCESSING --> FAILED: Payment error
 FAILED --> [*]
 COMPLETED --> [*]

Usage Example

An API endpoint that processes a booking using the orchestrator:

@router.post("/bookings")
async def create_booking(booking_request: BookingRequest, orchestrator: BookingOrchestrator = Depends(get_orchestrator)) -> BookingResponse:
 context = BookingContext(
        booking_id=generate_booking_id(),
        state=BookingState.INITIATED,
        user_id=booking_request.user_id,
        search_criteria=booking_request.search_criteria,
        created_at=datetime.utcnow(),
        updated_at=datetime.utcnow()
    )

 final_context = await orchestrator.process_booking(context)

    return BookingResponse(
        booking_id=final_context.booking_id,
        status=final_context.state.value,
        error=final_context.error,
        flights=final_context.flights,
        seats=final_context.selected_seats,
        payment_info=final_context.payment_info
    )

Implementation Considerations

State Management: Persist the BookingContext (e.g. using a database).
Retry Strategies: Use exponential backoff where necessary.
Transaction Boundaries: Maintain atomicity of state transitions.
Monitoring: Record state transition times and error counts.

Pattern 4: Graph-based Control Flow

Implementation Details

This pattern uses a graph structure to manage complex workflows:

@dataclass
class BookingState:
    """State maintained throughout the booking process."""
    origin: str
    destination: str
    travel_date: date
    selected_flight: Optional[FlightDetails] = None
    selected_seat: Optional[SeatPreference] = None
    payment_info: Optional[str] = None
    booking_confirmed: bool = False
    agent_messages: Dict[str, List[ModelMessage]] = field(default_factory=dict)

@dataclass
class SearchFlights(BaseNode[BookingState]):
    """Search for available flights."""
    async def run(self, ctx: GraphRunContext[BookingState]) -> BaseNode[BookingState] | End[None]:
        result = await flight_search_agent.run(
            f"Find flights from {ctx.state.origin} to {ctx.state.destination} on {ctx.state.travel_date}"
        )
        ctx.state.selected_flight = result.data[0]
        return SelectSeat()

@dataclass
class SelectSeat(BaseNode[BookingState]):
    """Select seat preferences for the flight."""
    async def run(self, ctx: GraphRunContext[BookingState]) -> BaseNode[BookingState]:
        result = await seat_selection_agent.run("I'd like a window seat with extra legroom if possible")
        ctx.state.selected_seat = result.data
        return ProcessPayment()

Implementation Considerations

State Design: Define clear transitions and metadata.
Error Recovery: Implement retryable and monitored nodes.
Monitoring and Metrics: Track execution times, transitions, and errors.

Below is a mermaid diagram representing a typical graph-based workflow:

stateDiagram-v2
 [*] --> SearchFlights
 SearchFlights --> SelectFlights: Flights found
 SearchFlights --> [*]: Error
 SelectFlights --> SearchHotels: Flights selected
 SelectFlights --> SearchFlights: Retry search
 SelectFlights --> [*]: Error
 SearchHotels --> SelectHotels: Hotels found
 SearchHotels --> [*]: Error
 SelectHotels --> SelectSeats: Hotels selected
 SelectSeats --> ProcessPayment: Seats selected
 SelectSeats --> SelectSeats: Retry selection
 ProcessPayment --> [*]: Success/Error

Usage Example

An API endpoint employing a graph-based runner might look like this:

@router.post("/bookings/graph")
async def create_booking_graph(booking_request: BookingRequest, runner: BookingGraphRunner = Depends(get_graph_runner)) -> BookingResponse:
 initial_state = BookingGraphState(
        booking_id=generate_booking_id(),
        user_id=booking_request.user_id,
        origin=booking_request.origin,
        destination=booking_request.destination,
        dates=booking_request.dates,
        preferences=booking_request.preferences,
        agent_messages={}
    )
 final_state = await runner.run(initial_state)
    return BookingResponse(
        booking_id=final_state.booking_id,
        status="completed" if not final_state.error else "failed",
        error=final_state.error,
        flights=final_state.flights,
        hotels=final_state.hotels,
        selected_seats=final_state.selected_seats,
        payment_info=final_state.payment_info
    )

Choosing the Right Pattern

Comparative Analysis

Aspect	Single Agent	Agent Delegation	Programmatic Hand-off	Graph-based Flow
Complexity	Low	Medium	High	Very High
State Management	Simple	Moderate	Complex	Very Complex
Error Handling	Basic	Coordinated	Programmatic	Comprehensive
Scalability	Limited	Good	Very Good	Excellent
Development Speed	Fast	Medium	Slow	Very Slow
Maintenance	Easy	Moderate	Complex	Very Complex
Flexibility	Limited	Good	Very Good	Excellent
Monitoring	Simple	Moderate	Complex	Very Complex

Real-World Applications

Flight Booking Systems

Example: A multi-city flight booking system using Graph-based Flow to manage complex routing.

Hotel Booking Systems

Example: Using Agent Delegation for optimised hotel search and pricing.

Travel Package Systems

Example: Building dynamic packages via Programmatic Hand-off to sequence flights, hotels and activities.

Travel Itinerary Systems

Example: Employing Single Agent Workflows for fast, AI-powered itinerary optimisation.

Best Practices

Pattern Selection

Begin with the most straightforward pattern that meets your needs.
Consider future scalability and complexity before adding layers.

Implementation

Use strong typing with Pydantic models.
Incorporate comprehensive logging and monitoring.
Build robust error handling and retry strategies.

Testing

Unit test individual agents.
Integration test agent interactions.
Execute end-to-end tests and stress-test state management.

Monitoring

Track performance metrics and state transitions.
Log error rates and adjust retry strategies as needed.

Conclusion

Multi-agent patterns provide a structured and scalable approach to managing complex tasks in modern applications.

Whether you're implementing a simple, single-agent workflow or a comprehensive, graph-based control flow, the right pattern will ensure clarity, reliability and an excellent user experience.

By following these best practices and real-world examples from the travel domain, you can build robust and adaptable systems.

Multi-Agent Patterns: A Practical Guide

Table of Contents

Introduction

The Domain Challenge

Complex User Interactions

Sequential Processing Requirements

External System Integration

Domain Expertise Requirements

Pattern 1: Single Agent Workflows

Implementation Details

When to Use Single Agent Workflows

Implementation Considerations

Code Example: Error Handling

Pattern 2: Agent Delegation

Implementation Architecture

Benefits and Trade-offs

Error Handling Strategies

Implementation Considerations

Pattern 3: Programmatic Agent Hand-off

Implementation Details

Error Handling Strategies

Usage Example

Implementation Considerations

Pattern 4: Graph-based Control Flow

Implementation Details

Implementation Considerations

Usage Example

Choosing the Right Pattern

Comparative Analysis

Real-World Applications

Flight Booking Systems

Hotel Booking Systems

Travel Package Systems

Travel Itinerary Systems

Best Practices

Pattern Selection

Implementation

Testing

Monitoring

Conclusion

Further Reading