Agent Tracing

Key Points

Trace agent execution end-to-end: agent → LLM call → tool call → DB call → next LLM call → ...
OpenTelemetry covers this with standard distributed tracing; each agent propagates the trace context to whatever it calls.
Span hierarchy mirrors the call tree.
For multi-agent orchestrations: one span per agent and one per step, linked by a shared orchestration tag.
Without tracing, agents are black boxes and failures are very hard to debug.

Span hierarchy

[POST /chat]                                     ← incoming
└─ [Agent: TripPlanner.InvokeAsync]
   ├─ [chat (gen_ai.system=openai)]              ← LLM 1: decide tool
   ├─ [Tool: SearchFlights]                       ← tool execution
   │  └─ [HTTP GET https://flight-api/]
   ├─ [chat (gen_ai.system=openai)]              ← LLM 2: synthesize
   └─ [agent.response]

Each .NET Activity is exported as one span; child activities nest under their parent, so the trace shows the call tree.
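The nesting can be reproduced with nothing but System.Diagnostics. The following is a minimal sketch, not the agent framework's own instrumentation: the source name "Demo.Agents" and the in-memory listener are assumptions made for illustration, standing in for the OpenTelemetry SDK and a real exporter.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical source name; any stable string works.
var source = new ActivitySource("Demo.Agents");

// Without a listener (or the OpenTelemetry SDK), StartActivity returns null.
var finished = new List<Activity>();
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Demo.Agents",
    Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
        ActivitySamplingResult.AllDataAndRecorded,
    ActivityStopped = a => finished.Add(a)
};
ActivitySource.AddActivityListener(listener);

// Activities started while another one is current become its children.
using (source.StartActivity("Agent: TripPlanner.InvokeAsync"))
{
    using (source.StartActivity("chat")) { }                 // LLM 1: decide tool
    using (source.StartActivity("Tool: SearchFlights")) { }  // tool execution
    using (source.StartActivity("chat")) { }                 // LLM 2: synthesize
}

// Activities are collected in the order they stop, so the root comes last.
foreach (var a in finished)
{
    var parent = a.Parent?.OperationName ?? "(root)";
    Console.WriteLine($"{a.OperationName} <- {parent}");
}
```

All four activities share one TraceId, which is what lets a trace viewer draw them as a single tree.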

Setup

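// Source name is an example; register it with AddSource("MyApp.Agents") in the tracing setup.
private static readonly ActivitySource _src = new("MyApp.Agents");

// Builder order is outermost-first. Function invocation sits outside the
// OpenTelemetry client so that each LLM round trip gets its own chat span.
chat = chat.AsBuilder()
    .UseFunctionInvocation()
    .UseOpenTelemetry()
    .Build();

// In agent code:
using var activity = _src.StartActivity("Agent.InvokeAsync");
activity?.SetTag("agent.name", agent.Name);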

Tags to add

activity?.SetTag("agent.name", agent.Name);
activity?.SetTag("orchestration", orchestrationName);
activity?.SetTag("session.id", sessionId);
activity?.SetTag("user.id", userId);
activity?.SetTag("tenant.id", tenantId);

These tags let you filter and aggregate traces in the backend, for example all traces for one tenant or one orchestration.

Multi-agent orchestration

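public async Task<AgentResponse> InvokeAsync(string input, CancellationToken ct)
{
    // Parent span for the whole orchestration; every step below becomes a child of it.
    using var activity = _src.StartActivity("Orchestration.Sequential");
    activity?.SetTag("agents", string.Join(",", _agents.Select(a => a.Name)));

    foreach (var agent in _agents)
    {
        // One span per step; it ends when the loop iteration ends.
        using var stepActivity = _src.StartActivity($"Step.{agent.Name}");
        stepActivity?.SetTag("agent.name", agent.Name);
        // `messages` is the running conversation (its construction is elided here).
        var resp = await agent.InvokeAsync(messages, ct);
        // ...
    }
}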

Tool tracing

The UseFunctionInvocation() middleware automatically wraps each tool call in a span (exact span names vary by library version):

[chat]
└─ [function.GetWeather]
   └─ [HTTP GET https://weather-api/]

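Tag each tool span with the tool name, a summary of the arguments, and the result type. Argument and result payloads count as sensitive data, and capturing them is off by default.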

Status / errors

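try
{
    /* ... */
    activity?.SetStatus(ActivityStatusCode.Ok);
}
catch (Exception ex)
{
    // RecordException is an extension method from the OpenTelemetry.Trace namespace.
    activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
    activity?.RecordException(ex);
    throw; // rethrow so callers (and their spans) also see the failure
}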

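Span status is set per span. The rethrown exception travels up the call stack, so each enclosing span can record the failure in its own catch block; a parent that does not catch it keeps its status unset.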
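How a failure shows up at each level of the tree can be checked with a small experiment. This is a sketch under stated assumptions: the `Traced` helper, the source name "Demo.Errors", and the in-memory `statuses` dictionary are all hypothetical, introduced only to compare a span that marks errors with one that does not.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

var source = new ActivitySource("Demo.Errors"); // hypothetical source name

// Record the final status of every span when it stops.
var statuses = new Dictionary<string, ActivityStatusCode>();
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Demo.Errors",
    Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
        ActivitySamplingResult.AllDataAndRecorded,
    ActivityStopped = a => statuses[a.OperationName] = a.Status
};
ActivitySource.AddActivityListener(listener);

// Hypothetical helper: runs `body` inside a span and, if markErrors is true,
// sets the span status to Error before rethrowing.
void Traced(string name, Action body, bool markErrors)
{
    using var activity = source.StartActivity(name);
    try
    {
        body();
        activity?.SetStatus(ActivityStatusCode.Ok);
    }
    catch (Exception ex) when (markErrors)
    {
        activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
        throw;
    }
}

try
{
    // The outer span does NOT mark errors; the inner one does.
    Traced("Agent.InvokeAsync", () =>
        Traced("Tool: SearchFlights",
            () => throw new InvalidOperationException("flight API down"),
            markErrors: true),
        markErrors: false);
}
catch (InvalidOperationException)
{
    // Swallowed here only so the demo can print the results.
}

foreach (var (name, status) in statuses)
    Console.WriteLine($"{name}: {status}");
```

The tool span ends with status Error, while the agent span stays Unset even though the exception passed straight through it. In real agent code, every layer that owns a span would normally use the catch-and-rethrow pattern.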

Token usage per agent

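// Meter name is an example; register it with AddMeter("MyApp.Agents") in the metrics setup.
private static readonly Meter _meter = new("MyApp.Agents");
private static readonly Counter<long> _tokens = _meter.CreateCounter<long>("agent.tokens");

// Assumes `usage` is Microsoft.Extensions.AI's UsageDetails, whose token counts are
// nullable (hence ?? 0), and `response` is the ChatResponse that carried it.
_tokens.Add(usage.InputTokenCount ?? 0, new TagList
{
    { "agent.name", agent.Name },
    { "type", "input" },
    { "model", response.ModelId }
});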

Grouping this counter by agent.name and model gives per-agent token and cost dashboards.
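The per-agent aggregation that a dashboard performs can be imitated locally with a MeterListener. This is a sketch only: the meter name "Demo.Agents", the `Record` helper, the agent names, and the token counts are made up for illustration, and the dictionary stands in for a real metrics backend.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;

var meter = new Meter("Demo.Agents"); // hypothetical meter name
var tokens = meter.CreateCounter<long>("agent.tokens");

// Stand-in for a metrics backend: sums measurements per agent.name tag.
var totals = new Dictionary<string, long>();
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == "Demo.Agents")
        l.EnableMeasurementEvents(instrument);
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    foreach (var tag in tags)
    {
        if (tag.Key == "agent.name" && tag.Value is string agent)
            totals[agent] = totals.GetValueOrDefault(agent) + value;
    }
});
listener.Start();

// Hypothetical helper mirroring the counter call used in agent code.
void Record(string agent, string type, long count) =>
    tokens.Add(count, new TagList { { "agent.name", agent }, { "type", type } });

Record("TripPlanner", "input", 1200);
Record("TripPlanner", "output", 300);
Record("Booker", "input", 800);

foreach (var (agent, total) in totals)
    Console.WriteLine($"{agent}: {total} tokens");
```

Multiplying each total by a per-model price would turn the token counts into a cost estimate; the prices themselves are not part of this sketch.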

Visualizing in tools

Datadog APM: agent spans render as a nested flame graph.
Jaeger / Tempo: trace tree view.
App Insights: end-to-end transaction.
LangSmith (commercial): AI-specific UI.

Sampling

For high-traffic apps:

.WithTracing(t => t.SetSampler(new ParentBasedSampler(new TraceIdRatioBasedSampler(0.1))))

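This keeps roughly 10% of traces. The decision is made at the root span and child spans follow it, so it is taken before anyone knows whether the trace will fail. To always keep errors, sample at the collector instead:

// Tail sampling at collector: keep all error traces.
// The SDK must then export every span (no head sampling in front of it).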
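In practice tail sampling is configured in the collector rather than written by hand, but the decision it makes is easy to sketch. The `KeepTrace` function below is hypothetical and not a collector API: it assumes the whole trace has already arrived and reduces each span to a (TraceId, IsError) pair.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical tail-sampling decision for ONE complete trace.
// `ratio` is the baseline keep rate for traces without errors.
bool KeepTrace(IReadOnlyList<(string TraceId, bool IsError)> spans, double ratio)
{
    if (spans.Count == 0) return false;

    // Rule 1: never drop a trace that contains a failed span.
    if (spans.Any(s => s.IsError)) return true;

    // Rule 2: otherwise keep a fixed share of traces. The decision is derived
    // from the first 32 bits of the trace id, so it is stable across retries
    // and identical on every collector instance.
    uint bucket = Convert.ToUInt32(spans[0].TraceId.Substring(0, 8), 16);
    return bucket < ratio * 4294967296.0;
}

// Made-up trace ids, chosen so the healthy trace falls outside the 10% bucket.
var failed = new List<(string TraceId, bool IsError)>
{
    ("ffffffffffffffffffffffffffffffff", false),
    ("ffffffffffffffffffffffffffffffff", true),
};
var healthy = new List<(string TraceId, bool IsError)>
{
    ("80000000000000000000000000000000", false),
};

Console.WriteLine($"failed trace kept:  {KeepTrace(failed, 0.1)}");
Console.WriteLine($"healthy trace kept: {KeepTrace(healthy, 0.1)}");
```

The cost of this approach is that the collector has to buffer every span of a trace until the trace is complete, which is why it runs there and not in the application.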

Cross-process tracing

If agents call other services (HTTP, gRPC, queue), trace context propagates:

[Agent A] → HTTP → [Service B]
   ↓                  ↓
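   span               child span (same TraceId)

W3C Trace Context propagation (the traceparent header) handles this automatically for instrumented HTTP and gRPC calls. Message queues usually need the context written into and read from message headers explicitly.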
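What travels between the two processes is a single header value. The sketch below simulates both sides in one process using only System.Diagnostics; the source name "Demo.Propagation" and the span names are assumptions, and the string variable `traceparent` stands in for the HTTP header that instrumented clients set automatically.

```csharp
using System;
using System.Diagnostics;

// Make sure ids use the W3C format (the default on current .NET versions).
Activity.DefaultIdFormat = ActivityIdFormat.W3C;
Activity.ForceDefaultIdFormat = true;

var source = new ActivitySource("Demo.Propagation"); // hypothetical source name
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Demo.Propagation",
    Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
        ActivitySamplingResult.AllDataAndRecorded
};
ActivitySource.AddActivityListener(listener);

// --- Agent A (caller side) ---
var outbound = source.StartActivity("Agent A: call Service B", ActivityKind.Client)
    ?? throw new InvalidOperationException("no listener registered");
// In W3C format, Activity.Id has the traceparent layout:
// version-traceid-spanid-flags
string traceparent = outbound.Id ?? "";
string callerTraceId = outbound.TraceId.ToString();
string callerSpanId = outbound.SpanId.ToString();
outbound.Stop();

// --- Service B (callee side), given only the header value ---
if (!ActivityContext.TryParse(traceparent, null, out var parentContext))
    throw new InvalidOperationException("malformed traceparent");

using var inbound = source.StartActivity(
    "Service B: handle request", ActivityKind.Server, parentContext);

Console.WriteLine($"traceparent: {traceparent}");
Console.WriteLine($"same trace:  {inbound?.TraceId.ToString() == callerTraceId}");
Console.WriteLine($"parent span: {inbound?.ParentSpanId.ToString() == callerSpanId}");
```

For a queue, the same value would be stored as a message header by the producer and parsed by the consumer in the same way.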

Debugging long agent chains

1. Open trace in viewer.
2. See full call tree.
3. Identify slow / failed step.
4. Drill into that span's details.
5. Fix.

Without traces, you are left with log files correlated by request ID, and each investigation takes minutes to hours.

Senior considerations

Always trace agent flows; debugging them without traces is a nightmare.
Tag with semantic context: orchestration, agent, user, tenant.
Sample for cost, but keep all errors.
Correlate metrics and traces via TraceId.