Agent Tracing
Key Points
- Trace agent execution end-to-end: agent → LLM call → tool call → DB call → next LLM call → ...
- OpenTelemetry handles this with standard distributed tracing; agents only need to propagate the trace context.
- Span hierarchy mirrors the call tree.
- For multi-agent orchestrations: each agent and each step gets its own span; a shared orchestration tag links them.
- Without tracing, agents are black boxes, and failures are very hard to diagnose.
Span hierarchy
[POST /chat]                                  ← incoming
└─ [Agent: TripPlanner.InvokeAsync]
   ├─ [chat (gen_ai.system=openai)]           ← LLM 1: decide tool
   ├─ [Tool: SearchFlights]                   ← tool execution
   │  └─ [HTTP GET https://flight-api/]
   ├─ [chat (gen_ai.system=openai)]           ← LLM 2: synthesize
   └─ [agent.response]
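Each Activity is one span (Activity is .NET's span type). An Activity started while another one is current becomes its child, and that nesting produces the call tree.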
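The parent/child relationship can be checked without any exporter. The following is a minimal sketch using only System.Diagnostics. The source name Demo.Agents and the SpanTreeDemo class are placeholders, and the in-process ActivityListener stands in for the OpenTelemetry SDK, which normally registers the listener for you.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class SpanTreeDemo
{
    // Placeholder source name, used only for this sketch.
    private static readonly ActivitySource Source = new("Demo.Agents");

    // Runs a fake agent turn and returns the spans in the order they finished.
    public static List<Activity> Run()
    {
        var stopped = new List<Activity>();

        // Without a listener, StartActivity returns null and nothing is recorded.
        using var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == "Demo.Agents",
            Sample = (ref ActivityCreationOptions<ActivityContext> _) =>
                ActivitySamplingResult.AllDataAndRecorded,
            ActivityStopped = stopped.Add
        };
        ActivitySource.AddActivityListener(listener);

        using (Source.StartActivity("Agent: TripPlanner.InvokeAsync"))
        {
            // Each inner activity starts while the agent activity is current,
            // so it becomes a child of the agent span.
            using (Source.StartActivity("chat")) { }
            using (Source.StartActivity("Tool: SearchFlights")) { }
            using (Source.StartActivity("chat")) { }
        }

        return stopped;
    }
}
```

Children stop before their parent, so the agent span is the last entry in the returned list, and the three earlier entries carry its SpanId as their ParentSpanId.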
Setup
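// Placeholder source name; register it with the tracer provider via AddSource("MyApp.Agents").
private static readonly ActivitySource ActivitySource = new("MyApp.Agents");

// The first middleware added is the outermost. With function invocation outside,
// each LLM round trip gets its own chat span instead of one span around the whole tool loop.
chat = chat.AsBuilder()
    .UseFunctionInvocation()
    .UseOpenTelemetry()
    .Build();

// In agent code:
using var activity = ActivitySource.StartActivity("Agent.InvokeAsync");
activity?.SetTag("agent.name", agent.Name);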
Tags to add
activity?.SetTag("agent.name", agent.Name);
activity?.SetTag("orchestration", orchestrationName);
activity?.SetTag("session.id", sessionId);
activity?.SetTag("user.id", userId);
activity?.SetTag("tenant.id", tenantId);
These tags let you filter and aggregate traces by agent, orchestration, session, user, and tenant.
Multi-agent orchestration
public async Task<AgentResponse> InvokeAsync(string input, CancellationToken ct)
{
    // Parent span for the whole orchestration.
    using var activity = _src.StartActivity("Orchestration.Sequential");
    activity?.SetTag("agents", string.Join(",", _agents.Select(a => a.Name)));
    foreach (var agent in _agents)
    {
        // One child span per step; it is disposed at the end of each iteration.
        using var stepActivity = _src.StartActivity($"Step.{agent.Name}");
        stepActivity?.SetTag("agent.name", agent.Name);
        var resp = await agent.InvokeAsync(messages, ct);
        // ...
    }
}
Tool tracing
The UseFunctionInvocation() middleware automatically wraps each tool call in its own span.
Each span is tagged with the tool name, a summary of the arguments (sensitive data capture is off by default), and the result type.
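When a tool is invoked outside that middleware, a span of the same shape can be produced by hand. This is a sketch under stated assumptions, not the middleware's implementation: the ToolTracing class, the source name MyApp.Agents.Tools, the tag names tool.name, tool.arguments, and tool.result.type, and the CaptureArguments switch are all hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ToolTracing
{
    // Hypothetical source name for hand-traced tools.
    private static readonly ActivitySource Source = new("MyApp.Agents.Tools");

    // Arguments may contain user data, so capturing them is opt-in.
    public static bool CaptureArguments { get; set; } = false;

    public static async Task<TResult> InvokeAsync<TResult>(
        string toolName, string argsSummary, Func<Task<TResult>> tool)
    {
        using var activity = Source.StartActivity($"Tool: {toolName}");
        activity?.SetTag("tool.name", toolName);
        if (CaptureArguments)
        {
            activity?.SetTag("tool.arguments", argsSummary);
        }

        try
        {
            var result = await tool();
            // Record only the declared result type, never the value itself.
            activity?.SetTag("tool.result.type", typeof(TResult).Name);
            activity?.SetStatus(ActivityStatusCode.Ok);
            return result;
        }
        catch (Exception ex)
        {
            activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
            throw;
        }
    }
}
```

A call would look like `await ToolTracing.InvokeAsync("SearchFlights", "from=OSL", () => SearchFlightsAsync(query))`, where SearchFlightsAsync is likewise a hypothetical tool method.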
Status / errors
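try
{
    /* ... */
    activity?.SetStatus(ActivityStatusCode.Ok);
}
catch (Exception ex)
{
    activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
    activity?.RecordException(ex); // extension method from the OpenTelemetry.Trace namespace
    throw;
}
The rethrown exception travels up the call stack, so every enclosing span that uses the same pattern is marked as failed too. Span status is not inherited automatically.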
Token usage per agent
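// Placeholder meter name; register it with the meter provider via AddMeter("MyApp.Agents").
private static readonly Meter _meter = new("MyApp.Agents");
private static readonly Counter<long> _tokens = _meter.CreateCounter<long>("agent.tokens");

// Assumes `usage` is Microsoft.Extensions.AI's UsageDetails (nullable token counts)
// and `response` is the ChatResponse it came from, which carries the model ID.
_tokens.Add(usage.InputTokenCount ?? 0, new TagList
{
    { "agent.name", agent.Name },
    { "type", "input" },
    { "model", response.ModelId }
});
Tagging by agent and model makes per-agent cost dashboards possible.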
Visualizing in tools
- Datadog APM: agent spans render as nested service map.
- Jaeger / Tempo: trace tree view.
- App Insights: end-to-end transaction.
- LangSmith (commercial): AI-specific UI.
Sampling
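For high-traffic apps, sample about 10% of traces but always keep traces that contain errors.
A head sampler decides before the outcome is known, so keeping every error requires tail-based sampling, for example in the OpenTelemetry Collector.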
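The OpenTelemetry .NET SDK ships TraceIdRatioBasedSampler and ParentBasedSampler for ratio sampling. To show the mechanics, here is a minimal sketch of the same idea using only System.Diagnostics. RatioSampling, ShouldSample, and CreateListener are hypothetical names, and the sketch covers only the 10% head decision; it cannot keep errors, because the decision is made when the root span starts.

```csharp
using System;
using System.Diagnostics;

public static class RatioSampling
{
    // Deterministic decision: every process that sees the same trace ID
    // reaches the same answer, so a trace is kept or dropped as a whole.
    public static bool ShouldSample(ActivityTraceId traceId, double ratio)
    {
        if (ratio >= 1.0) return true;
        if (ratio <= 0.0) return false;

        // Interpret the first 8 bytes of the trace ID as a number in [0, 2^64).
        ulong bucket = Convert.ToUInt64(traceId.ToHexString().Substring(0, 16), 16);
        return bucket < ratio * ulong.MaxValue;
    }

    // The caller registers the listener with ActivitySource.AddActivityListener.
    public static ActivityListener CreateListener(string sourceName, double ratio) => new()
    {
        ShouldListenTo = source => source.Name == sourceName,
        Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        {
            bool keep = options.Parent.TraceId != default
                // Child span: follow the decision already made for the parent.
                ? (options.Parent.TraceFlags & ActivityTraceFlags.Recorded) != 0
                // Root span: decide from the trace ID.
                : ShouldSample(options.TraceId, ratio);

            // PropagationData still creates the Activity, so the trace ID flows
            // to children and downstream calls, but marks it as not recorded.
            return keep
                ? ActivitySamplingResult.AllDataAndRecorded
                : ActivitySamplingResult.PropagationData;
        }
    };
}
```

With `RatioSampling.CreateListener("MyApp.Agents", 0.10)` registered, roughly one trace in ten has its Recorded flag set; an exporter would then skip activities where Recorded is false.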
Cross-process tracing
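If agents call other services (HTTP, gRPC, queues), the trace context travels with the call.
OpenTelemetry propagates it in the W3C Trace Context headers (traceparent, tracestate). Instrumented HTTP and gRPC clients do this automatically; custom queue messages may need the headers injected and extracted by hand.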
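For transports without automatic instrumentation, such as a custom queue message, the context can be carried by hand. The sketch below uses .NET's built-in DistributedContextPropagator to write and read the traceparent header. QueueTracing, the source name MyApp.Agents, the span names, and the header dictionary standing in for message metadata are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class QueueTracing
{
    // Placeholder source name. Activities are only created when a listener
    // (normally the OpenTelemetry SDK) is subscribed to this source.
    private static readonly ActivitySource Source = new("MyApp.Agents");

    // Producer side: start a span and copy its context into message headers.
    public static Dictionary<string, string> CreateOutgoingHeaders()
    {
        using var activity = Source.StartActivity("queue.publish", ActivityKind.Producer);
        var headers = new Dictionary<string, string>();

        // Writes "traceparent" (and "tracestate" when present) for W3C-format IDs.
        DistributedContextPropagator.Current.Inject(activity, headers,
            static (carrier, name, value) =>
                ((Dictionary<string, string>)carrier!)[name] = value);
        return headers;
    }

    // Consumer side: continue the producer's trace from the received headers.
    public static Activity? StartConsume(IReadOnlyDictionary<string, string> headers)
    {
        DistributedContextPropagator.Current.ExtractTraceIdAndState(
            headers, ReadHeader, out var traceParent, out var traceState);

        // If parsing fails, parent stays default and a new trace is started.
        ActivityContext.TryParse(traceParent, traceState, out var parent);
        return Source.StartActivity("queue.consume", ActivityKind.Consumer, parent);
    }

    private static void ReadHeader(object? carrier, string name,
        out string? value, out IEnumerable<string>? values)
    {
        values = null;
        value = carrier is IReadOnlyDictionary<string, string> h
                && h.TryGetValue(name, out var v) ? v : null;
    }
}
```

The consumer span then shares the producer's TraceId and has the producer span as its parent, so both sides appear in one trace.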
Debugging long agent chains
1. Open trace in viewer.
2. See full call tree.
3. Identify slow / failed step.
4. Drill into that span's details.
5. Fix.
Without traces, you are left with log files correlated by request ID, and each investigation takes minutes to hours.
Senior considerations
- Always trace agent flows; debugging without traces is painful.
- Tag with semantic context: orchestration, agent, user, tenant.
- Sample to control cost, but keep all errors.
- Correlate metrics and traces via TraceId.