Why this example
Most agent demos use a simple tool call: look up the weather, search a database, summarise a document. These are fine for illustrating a single concept, but they don't show you what happens when the pieces interact.
A trip planner forces every concept from the enterprise agents page to work together at once. Intent is ambiguous, data comes from multiple services, some tasks can run in parallel, and the user expects a coherent result even when parts of the system fail.
Everything on this page — the event topics, the agent roles, the context window design, the fallback chain — is a direct consequence of those requirements. The system shape comes from the problem.
Ambiguous intent
"Plan a trip to Tokyo in spring" is not a structured API call. The agent has to parse intent, fill in gaps, and ask when it genuinely doesn't know.
Multiple external APIs
Flights, hotels, weather, and activities are separate services with different latencies, rate limits, and failure modes. The agent can't treat them as one.
Natural parallelism
Once the destination and dates are known, flight and hotel searches can run at the same time. There is no reason to wait for one before starting the other.
Persistent session state
The user might refine the plan across several turns. The agent needs to remember what was decided, what was rejected, and why.
Graceful degradation
If the flight API is down, the agent shouldn't fail the whole request. It should work with what it has and tell the user clearly what it couldn't do.
The system you build for a trip planner is structurally identical to what you would build for a procurement agent, an onboarding assistant, or a research workflow. The domain changes; the architecture doesn't.
What the agent needs to do
Before designing the system, it helps to walk through what the agent actually needs to do at each step. This is the lifecycle from first user message to final itinerary.
"I want to visit Tokyo for a week in late April, budget around $3,000, just me."
The most important design decision happens at step 3: how much to ask versus how much to assume. Agents that ask too many questions feel unusable; agents that assume too much produce wrong results. The right answer is to assume, surface the assumption, and let the user correct it.
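One way to make that policy concrete is to tag every extracted constraint with whether it was stated or assumed, so assumptions can be surfaced for correction. This is an illustrative sketch: `Constraint`, `TripConstraints`, and the field names are hypothetical, not part of the system described here.

```python
from dataclasses import dataclass

@dataclass
class Constraint:
    value: object
    assumed: bool = False  # True when the agent filled the gap itself

@dataclass
class TripConstraints:
    destination: Constraint
    origin: Constraint
    budget_usd: Constraint

    def assumptions(self):
        """Return the fields the user should be invited to confirm."""
        return {name: c.value for name, c in vars(self).items() if c.assumed}

intent = TripConstraints(
    destination=Constraint("Tokyo"),
    origin=Constraint("JFK", assumed=True),  # inferred, not stated
    budget_usd=Constraint(3000),
)
# intent.assumptions() -> {'origin': 'JFK'}
```

The agent can then render `assumptions()` back to the user ("Assuming you're flying from JFK") instead of either asking or silently guessing.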
System architecture
The system has five layers. A UI layer where the user interacts. An intent layer where free text becomes structured data. An event bus that decouples intent from execution. A pool of specialist agents that each own one concern. And an assembly layer that turns raw results into a final answer.
The event bus is the critical design choice. Without it, the orchestrator would call each agent in sequence — or in parallel but with a tangled web of direct calls. With it, agents are fully decoupled. You can add a new specialist (say, a visa requirements agent) without touching any existing code.
Each specialist agent is a consumer of one event type and a producer of one result event. This makes them independently deployable, independently testable, and independently replaceable.
Event orchestration design
The event bus is where choreography happens. Rather than an orchestrator calling each agent and waiting, the IntentAgent publishes one event and walks away. Each search agent reacts to it independently. The ResultAssembler listens for all four result events and assembles the plan when they arrive.
This is the fan-out pattern. One event, many consumers. No coupling between the producers and consumers. The IntentAgent has no idea who is listening.
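A minimal in-process sketch of the fan-out pattern (the `EventBus` API here is assumed for illustration, not the actual bus used in this system):

```python
from collections import defaultdict

class EventBus:
    """Topic-based pub/sub: producers and consumers never reference each other."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # One event, many consumers; the producer has no idea who is listening.
        for handler in self._subscribers[topic]:
            handler(payload)

bus = EventBus()
received = []
for name in ("FlightAgent", "HotelAgent", "WeatherAgent", "ActivitiesAgent"):
    bus.subscribe("trip.intent.parsed", lambda payload, n=name: received.append(n))

bus.publish("trip.intent.parsed", {"destination": "Tokyo"})
# received -> all four agent names
```

Adding a fifth specialist is one more `subscribe` call; the publisher is untouched, which is exactly the decoupling property described above.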
trip.intent.parsed
Producer: IntentAgent
Consumers: FlightAgent, HotelAgent, WeatherAgent, ActivitiesAgent
Purpose: Published once per request. Contains destination, dates, budget, origin, party size. All search agents subscribe to this topic.
Failure has its own event type. When an agent can't complete a search after retries, it publishes to the dead-letter topic rather than silently dropping the result. This means failures are visible, auditable, and retrievable without digging through logs.
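The dead-letter behaviour could be sketched as below. The `RecordingBus` stub, the `trip.flights.result` topic, and the `trip.deadletter` name are assumptions for illustration; the article does not name the actual topics.

```python
class RecordingBus:
    """Minimal stub; a real system would publish to the event bus."""
    def __init__(self):
        self.published = []

    def publish(self, topic, payload):
        self.published.append((topic, payload))

def search_with_dead_letter(bus, intent, search_fn, max_retries=3):
    """Try a search; after exhausting retries, publish the failure, don't drop it."""
    last_error = None
    for _ in range(max_retries):
        try:
            result = search_fn(intent)
            bus.publish("trip.flights.result", result)
            return result
        except Exception as exc:
            last_error = exc
    # Retries exhausted: make the failure visible, auditable, retrievable.
    bus.publish("trip.deadletter", {
        "source": "FlightAgent",
        "intent": intent,
        "error": repr(last_error),
        "attempts": max_retries,
    })
    return None
```

A monitor subscribed to the dead-letter topic can then retry, alert, or surface the failure to the assembler without anyone grepping logs.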
Agent roles
Six agents, each with a single responsibility. The search agents — FlightAgent, HotelAgent, WeatherAgent, ActivitiesAgent — don't call external APIs directly. Each one submits an intent to the Lattice capability runtime and receives a structured projection back.
IntentAgent
Parses the user's free-text request into a structured trip constraint object. Uses an LLM with a tight system prompt focused on extraction, not generation. Publishes trip.intent.parsed.
Input: raw user message
Output: trip.intent.parsed event
This agent sets the quality ceiling for everything downstream. Bad extraction → bad itinerary.
Separating hard dependencies (flights, hotels) from soft ones (weather, activities) lets the ResultAssembler apply different timeout policies to each. Because failure modes are declared inside the Lattice capability — not improvised in agent code — each agent's timeout behaviour is consistent and auditable.
Capabilities with Lattice
The search agents in this system don't call external APIs directly. Each one submits a typed intent to Lattice — a capability runtime that executes the underlying steps, manages retries and failure policy, keeps intermediate state out of the model's context, and returns a projection shaped for reasoning rather than integration.
The model sees two meta-tools regardless of how many capabilities exist. It never needs to know about the underlying APIs.
search_capabilities(query)
Returns matching capability signatures from the registry. The model calls this to discover what outcomes are available.
← "flight search for trip planner"
→ TripFlightSearch — inputs, projection schema, example
execute_capability(name, inputs)
Submits a typed intent to the runtime. Lattice loads the capability, executes all steps, and returns the projection. The model never sees intermediate state.
← "TripFlightSearch", {origin, dest, dates, budget}
→ FlightProjection {flights, best_option, status, alternatives}
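A rough sketch of how the two meta-tools could be backed by a registry. The `REGISTRY` shape and the validation logic are illustrative assumptions, not the real Lattice implementation:

```python
# Hypothetical registry: one entry per capability signature.
REGISTRY = {
    "TripFlightSearch": {
        "description": "flight search for trip planner",
        "inputs": ["origin", "destination", "departure_date",
                   "return_date", "passengers", "max_budget_usd"],
    },
}

def search_capabilities(query):
    """Return signatures whose description matches the query terms."""
    words = query.lower().split()
    return [
        {"name": name, **sig}
        for name, sig in REGISTRY.items()
        if any(w in sig["description"] for w in words)
    ]

def execute_capability(name, inputs, runtime):
    """Submit a typed intent; the runtime returns only the projection."""
    if name not in REGISTRY:
        raise KeyError(f"unknown capability: {name}")
    missing = [k for k in REGISTRY[name]["inputs"] if k not in inputs]
    if missing:
        raise ValueError(f"missing inputs: {missing}")
    # Intermediate step state stays inside the runtime, never in the context.
    return runtime(name, inputs)
```

The key property is that the model's tool surface is constant: new capabilities grow the registry, not the tool list.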
Capability signature
TripFlightSearch(
    origin: str,          # IATA code
    destination: str,     # IATA code
    departure_date: str,  # ISO 8601
    return_date: str,
    passengers: int,
    max_budget_usd: float
) -> FlightProjection
from lattice import capability, step, projection
from lattice.failure import retry, hard_failure, abort, fallback_to

@capability(name="TripFlightSearch", version="1.0")
async def trip_flight_search(ctx):

    # Primary provider: three retries with exponential backoff,
    # then fall back to the secondary provider.
    @step(scope="travel.search")
    @retry(max=3, backoff="exponential",
           on=[TimeoutError, ServerError])
    @hard_failure(on_exhausted=fallback_to(search_backup))
    async def search_primary():
        client = ctx.client("flights_api_primary")
        return await client.search(
            origin=ctx.intent.origin,
            destination=ctx.intent.destination,
            depart=ctx.intent.departure_date,
            ret=ctx.intent.return_date,
            pax=ctx.intent.passengers,
            max_price=ctx.intent.max_budget_usd)

    # Backup provider: two retries, then abort the capability.
    @step(scope="travel.search")
    @retry(max=2, backoff="exponential",
           on=[TimeoutError, ServerError])
    @hard_failure(on_exhausted=abort)
    async def search_backup():
        client = ctx.client("flights_api_secondary")
        return await client.search(...)

    # state holds each completed step's result, populated by the runtime.
    results = state.search_primary or state.search_backup
    top3 = sorted(results, key=lambda f: f.value_score)[:3]

    return projection(
        flights=top3,
        cheapest_usd=top3[0].total_price if top3 else None,
        best_option=top3[0] if top3 else None,
        status="found" if top3 else "none_in_budget",
        source="primary" if state.search_primary else "backup",
        alternatives=[] if top3 else [
            {"action": "expand_budget",
             "suggestion": "Increase budget by $200 to unlock 4 options"},
            {"action": "adjust_dates",
             "suggestion": "Shift by 3 days for 30% lower fares"},
        ]
    )

Adding a new step inside a capability — say, a price alert check inside TripFlightSearch — does not change the signature, does not add a new tool the model has to learn, and does not require any agent code to change. The capability is the unit of change, not the tool list.
Context engineering
The IntentAgent builds a context window before each LLM call. The contents are assembled dynamically based on what's known: trip constraints extracted from the current message, user preferences pulled from memory, conversation history from the session, and any context retrieved from earlier agent results.
Not all of this fits in one call. Context windows have limits, and even within the limit, attention is not uniform. Items in the middle of a long context receive less weight. So the IntentAgent prioritises: hard constraints first, then preferences, then history.
Context usage: 40 / 128 tokens (system prompt). Light context — room to add retrieved docs or history.
User preferences from memory are not free. Every token you inject from a past session is a token you can't use for the current retrieved content. The IntentAgent makes this tradeoff explicit: preferences get a fixed token budget, and if the session history is long, it gets summarised before injection.
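That budgeting policy might look like this in outline. `count_tokens` is a crude word-count stand-in for a real tokenizer, and the window sizes and budgets are illustrative, not the system's actual numbers:

```python
def count_tokens(text):
    return len(text.split())  # word-count proxy; use a real tokenizer in practice

def clip(text, budget):
    return " ".join(text.split()[:budget])

def build_context(constraints, preferences, history,
                  window=1000, preference_budget=100):
    """Hard constraints first, then a fixed budget for preferences,
    then history summarised only if it would overflow the window."""
    parts = [constraints]                       # hard constraints always go in
    used = count_tokens(constraints)

    prefs = clip(preferences, preference_budget)  # fixed budget for memory
    parts.append(prefs)
    used += count_tokens(prefs)

    remaining = window - used
    if count_tokens(history) > remaining:
        history = clip(history, remaining)      # stand-in for LLM summarisation
    parts.append(history)
    return "\n".join(parts)
```

The fixed preference budget is the explicit tradeoff described above: memory can never crowd out the current request's retrieved content.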
Dynamic orchestration
Once the intent event is published, all four search agents start simultaneously. None of them wait for each other. The ResultAssembler holds until the hard dependencies — flights and hotels — are done. Soft dependencies contribute if they finish within the timeout window.
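Under asyncio, the assembler's wait policy could be sketched as below (the function names and timeout value are assumptions for illustration):

```python
import asyncio

async def assemble(hard_tasks, soft_tasks, soft_timeout=2.0):
    # Hard dependencies (flights, hotels): wait as long as it takes.
    hard = await asyncio.gather(*hard_tasks)
    # Soft dependencies (weather, activities): contribute only if done in time.
    futures = [asyncio.ensure_future(t) for t in soft_tasks]
    done, pending = await asyncio.wait(futures, timeout=soft_timeout)
    for task in pending:
        task.cancel()                 # too slow; the plan ships without them
    soft = [t.result() for t in done]
    return {"hard": hard, "soft": soft}
```

The same shape generalises: the policy lives in the assembler, so a dependency can be promoted from soft to hard without touching the agents themselves.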
The trace below shows how this plays out for the Tokyo request: how the intent was parsed, which agents ran in parallel, and what came back.

Parsed intent:
Destination: Tokyo, Japan
Dates: Apr 22–29, 2026
Budget: $3,000
Party: 1 traveller
Origin: JFK (assumed)

Agents on the timeline: IntentAgent, FlightAgent, HotelAgent, WeatherAgent, ActivitiesAgent, ResultAssembler.

Wall time (critical path): 7.0s
If run sequentially: 13.4s
Saved by parallelism: −6.4s
The timeline makes the tradeoff visible. Sequential execution for the Tokyo trace would take 13.4s. Parallel execution brings it to 7.0s. The time saved scales with the number of independent agents — which is why the fan-out pattern is worth the added complexity.
Memory and session state
The trip planner needs three kinds of memory working at the same time. Working memory holds the current request — the constraints extracted in this session, the results received so far, any corrections the user has made. It lives in the session state attached to the request and disappears when the request is done.
Long-term memory holds user preferences: airports they prefer, hotel amenities they care about, destinations they've visited and want to avoid repeating. This is retrieved from a vector store at the start of each session and injected into the IntentAgent's context. It persists across requests.
The conversation buffer holds the dialogue from the current planning session. If the user rejects the first hotel option and says "something in Shinjuku instead," the buffer gives the IntentAgent the context to understand that correction without asking again.
Long-term memory is retrieved, not injected wholesale. The IntentAgent queries the vector store with the current request as a query, then brings in only the top-k most relevant preferences. Injecting everything a user has ever told the system would waste context tokens on irrelevant history.
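A toy version of that retrieval step, where `cosine` over hand-made vectors stands in for a real embedding model and vector store:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k_preferences(store, query_vec, k=3):
    """store: list of (preference_text, embedding) pairs.
    Returns only the k preferences most relevant to the current request."""
    scored = sorted(store, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in scored[:k]]
```

Only the returned top-k strings are injected into the IntentAgent's context; everything else the user has ever said stays in the store.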
Error handling
The FlightAgent is a hard dependency. If it fails completely, the itinerary can't include flights. The error handling chain tries to avoid that outcome through three layers: retry, fallback provider, and finally graceful degradation.
The first layer is the standard search call with all constraints and a timeout of 6 seconds; the retry and fallback layers engage only when it fails.
If both primary and secondary providers fail after all retries, the ResultAssembler receives an empty flights array with an error field. The itinerary is generated without flights, and the user is told clearly: "We couldn't find flight options right now. Here are hotel and activity suggestions for your dates."
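The degradation step itself can be small. The sketch below assumes result field names (`flights`, `hotels`, `activities`, `notes`) that the text does not specify:

```python
def build_itinerary(results):
    """Assemble a plan from whatever arrived; never fail the whole request."""
    notes = []
    if not results.get("flights"):
        # Hard dependency came back empty: degrade and tell the user plainly.
        notes.append("We couldn't find flight options right now. "
                     "Here are hotel and activity suggestions for your dates.")
    return {
        "flights": results.get("flights") or [],
        "hotels": results.get("hotels") or [],
        "activities": results.get("activities") or [],
        "notes": notes,
    }
```

The point is that degradation is a branch in the assembler, not an exception path: the happy path and the degraded path produce the same output shape.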
Observability
Each agent emits an OpenTelemetry trace span when it starts and when it finishes. Tool calls are child spans. LLM calls are child spans with token counts attached as span attributes. The result is a complete trace tree that shows exactly where time was spent.
The trace below represents a real request. The root span covers the full request duration. You can see the sequential IntentAgent phase, the parallel search phase where flights and hotels run concurrently, and the assembly phase at the end.
trip.plan (7.2s)
├─ IntentAgent.parse (1.8s)
│   ├─ LLM.extract_constraints (1.4s)
│   └─ VectorStore.get_preferences (0.2s)
├─ FlightAgent.search (3.2s)
│   └─ FlightsAPI.search (primary) (2.8s)
├─ HotelAgent.search (2.8s)
│   └─ HotelsAPI.search (2.4s)
├─ WeatherAgent.fetch (1.4s)
└─ ResultAssembler.build (2.0s)
    └─ LLM.generate_itinerary (1.6s)
The trace immediately shows that the LLM call inside IntentAgent (1.4s) dominates the first phase, and the FlightAgent API call (2.8s) dominates the parallel phase. These are the two places to optimise first — prompt caching for the intent extraction, and request timeout tuning for the flights API.
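The span pattern can be mimicked in a few lines of stdlib Python. A real implementation would use the OpenTelemetry SDK; the token counts here are made-up attribute values:

```python
import time
from contextlib import contextmanager

SPANS = []  # stand-in for an exporter

@contextmanager
def span(name, parent=None, **attributes):
    """Record a span with a parent link, arbitrary attributes, and a duration."""
    record = {"name": name, "parent": parent,
              "attrs": attributes, "start": time.monotonic()}
    try:
        yield record
    finally:
        record["duration_s"] = time.monotonic() - record["start"]
        SPANS.append(record)

with span("trip.plan"):
    with span("IntentAgent.parse", parent="trip.plan"):
        # LLM calls are child spans with token counts as attributes
        # (412 / 96 are illustrative numbers).
        with span("LLM.extract_constraints", parent="IntentAgent.parse",
                  prompt_tokens=412, completion_tokens=96):
            pass
```

Because children exit before their parents, the exporter sees leaf spans first; the trace tree is reconstructed from the parent links.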
Full architecture recap
Every component in one diagram. The user enters through the UI. The IntentAgent translates the request into a structured event. The event bus fans the event out to four search agents. Each agent calls its tool, handles its own failures, and publishes its result. The ResultAssembler waits for hard dependencies, calls the LLM, and sends the itinerary back.
The observability stack watches every hop. The memory store is read at the start and written when new preferences are detected. Dead-letter events from failed agents go to a monitoring service that can trigger retries or alert on-call.
This is the same structure you would use for a procurement agent, a research assistant, or a customer onboarding workflow. The domain changes. The topology — intent layer, event bus, specialist agents, assembler — stays the same.