IChatClient & Pipeline

Key Points

IChatClient is the canonical, vendor-neutral chat abstraction in .NET (Microsoft.Extensions.AI).
Pipeline pattern: chain middleware with AsBuilder(). Each Use* wraps the ones registered after it.
Built-in middleware: function invocation, logging, OpenTelemetry, distributed caching, options configuration.
Streaming via GetStreamingResponseAsync returns IAsyncEnumerable<ChatResponseUpdate>.
Function calling: describe functions with [Description] and pass them as tools; the UseFunctionInvocation() middleware dispatches the calls automatically.

The `IChatClient` interface

public interface IChatClient : IDisposable
{
    Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    object? GetService(Type serviceType, object? serviceKey = null);

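    // Recent releases have no Metadata property on the interface;
    // ChatClientMetadata is retrieved via GetService(typeof(ChatClientMetadata)).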
}

Every concrete client (OpenAI, Azure OpenAI, Anthropic adapter, Ollama, ONNX) implements this.
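
To make the shape of the interface concrete, here is a minimal sketch of an in-memory implementation, useful as a test double. EchoChatClient is a hypothetical name, not part of the library, and the sketch assumes a release where the interface consists of GetResponseAsync, GetStreamingResponseAsync, GetService, and Dispose.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical in-memory client: replies with "echo: <text of the last message>".
public sealed class EchoChatClient : IChatClient
{
    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        string last = messages.LastOrDefault()?.Text ?? string.Empty;
        var reply = new ChatMessage(ChatRole.Assistant, $"echo: {last}");
        return Task.FromResult(new ChatResponse(reply));
    }

    public async IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        // Build the full reply, then emit it one word at a time.
        ChatResponse full = await GetResponseAsync(messages, options, cancellationToken);
        foreach (string word in full.Text.Split(' '))
        {
            cancellationToken.ThrowIfCancellationRequested();
            yield return new ChatResponseUpdate
            {
                Role = ChatRole.Assistant,
                Contents = [new TextContent(word + " ")]
            };
        }
    }

    // Expose this instance for any service type it satisfies; no keyed services.
    public object? GetService(Type serviceType, object? serviceKey = null) =>
        serviceKey is null && serviceType.IsInstanceOfType(this) ? this : null;

    public void Dispose() { }
}
```

Because it satisfies IChatClient, a client like this can be passed anywhere a real provider client is expected, including AsBuilder() pipelines.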

Basic usage

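// Preview-era helper shown here; newer Microsoft.Extensions.AI.OpenAI releases use
// new OpenAIClient(apiKey).GetChatClient("gpt-4o-mini").AsIChatClient() instead.
IChatClient chat = new OpenAIClient(apiKey).AsChatClient("gpt-4o-mini");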

var response = await chat.GetResponseAsync("Hello!");
Console.WriteLine(response.Text);

Streaming

await foreach (var update in chat.GetStreamingResponseAsync("Tell me a story"))
{
    Console.Write(update.Text);
}

Each ChatResponseUpdate carries an incremental chunk of text plus, where the provider supplies them, the role and completion metadata such as the finish reason.

Multi-turn

var messages = new List<ChatMessage>
{
    new(ChatRole.System, "You are a helpful assistant."),
    new(ChatRole.User, "What's 2+2?"),
    new(ChatRole.Assistant, "4."),
    new(ChatRole.User, "Multiplied by 10?")
};

var resp = await chat.GetResponseAsync(messages);

ChatOptions

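var options = new ChatOptions
{
    Temperature = 0.2f,
    MaxOutputTokens = 500,
    StopSequences = ["END"],
    Seed = 42,                                        // first-class property, no need for AdditionalProperties
    Tools = [AIFunctionFactory.Create(GetWeather)],
    ToolMode = ChatToolMode.Auto,
    ResponseFormat = ChatResponseFormat.Json,
    // Vendor-specific setting; it only has an effect if the provider adapter reads it.
    AdditionalProperties = new() { ["reasoning_effort"] = "low" }
};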

Use AdditionalProperties for vendor-specific settings (reasoning_effort, etc.) that have no dedicated ChatOptions property.

Pipeline (`AsBuilder`)

chat = chat.AsBuilder()
    .UseFunctionInvocation()
    .UseLogging(loggerFactory)
    .UseOpenTelemetry(sourceName: "Application", configure: c => c.EnableSensitiveData = false)
    .UseDistributedCache(cache)
    .Build();

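Each Use* registers a middleware factory and returns the same builder; Build() produces the final IChatClient.

The pipeline executes outside-in, and the first Use* registered is the outermost layer:

Caller → FunctionInvocation → Logging → OpenTelemetry → DistCache → underlying client

Responses travel back through the same layers in reverse order. With this registration the cache sits closest to the provider, so every round trip made by the function-invocation loop passes through logging, telemetry, and the cache.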
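
The wrapping order can be checked with a small sketch. TraceChatClient, StubChatClient, and PipelineOrderDemo are hypothetical names introduced for illustration; the sketch assumes the Microsoft.Extensions.AI package, where ChatClientBuilder makes the first registered middleware the outermost one.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical middleware: records its name, then delegates to the next layer.
public sealed class TraceChatClient(IChatClient inner, string name, List<string> log)
    : DelegatingChatClient(inner)
{
    public override Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        log.Add(name);
        return base.GetResponseAsync(messages, options, cancellationToken);
    }
}

// Hypothetical stand-in for a real provider client.
public sealed class StubChatClient : IChatClient
{
    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default) =>
        Task.FromResult(new ChatResponse(new ChatMessage(ChatRole.Assistant, "ok")));

    public IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default) =>
        throw new NotSupportedException("Streaming is not part of this sketch.");

    public object? GetService(Type serviceType, object? serviceKey = null) => null;

    public void Dispose() { }
}

public static class PipelineOrderDemo
{
    // Returns the middleware names in the order a request reached them.
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();
        IChatClient chat = new StubChatClient()
            .AsBuilder()
            .Use(c => new TraceChatClient(c, "first", log))
            .Use(c => new TraceChatClient(c, "second", log))
            .Build();

        await chat.GetResponseAsync("ping");
        return log;
    }
}
```

The middleware registered first sees the request first, which is why placement matters for layers such as caching and function invocation.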

Function calling

[Description("Gets the current weather for a city")]
async Task<Weather> GetWeather([Description("City name")] string city, CancellationToken ct = default)
{
    var w = await _service.GetAsync(city, ct);
    return new Weather(w.Temp, w.Condition);
}

// Use:
var resp = await chat.GetResponseAsync(
    "What's the weather in Seattle?",
    new ChatOptions { Tools = [AIFunctionFactory.Create(GetWeather)] });

The UseFunctionInvocation() middleware:

1. Sees the tool-call request in the LLM's response.
2. Dispatches it to the matching .NET method.
3. Adds the tool result to the messages.
4. Calls the LLM again.
5. Returns the final answer.

Steps 1 to 4 repeat until the LLM stops requesting tools.
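
The loop can be observed without a real provider. In the sketch below, ScriptedToolModel is a hypothetical stand-in for an LLM: it requests get_weather once, then answers from the tool result it finds in the conversation. The tool name, call id, and demo class are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical scripted "model": first reply is a tool call, second is text.
public sealed class ScriptedToolModel : IChatClient
{
    public int Calls { get; private set; }

    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        Calls++;

        // Look for a tool result that the middleware appended to the conversation.
        FunctionResultContent? toolResult = messages
            .SelectMany(m => m.Contents)
            .OfType<FunctionResultContent>()
            .LastOrDefault();

        AIContent reply = toolResult is null
            ? new FunctionCallContent("call-1", "get_weather",
                new Dictionary<string, object?> { ["city"] = "Seattle" })
            : new TextContent($"Forecast: {toolResult.Result}");

        return Task.FromResult(
            new ChatResponse(new ChatMessage(ChatRole.Assistant, [reply])));
    }

    public IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default) =>
        throw new NotSupportedException("Streaming is not part of this sketch.");

    public object? GetService(Type serviceType, object? serviceKey = null) => null;

    public void Dispose() { }
}

public static class FunctionLoopDemo
{
    // Returns the final text and how many times the "model" was called.
    public static async Task<(string Text, int ModelCalls)> RunAsync()
    {
        var model = new ScriptedToolModel();
        IChatClient chat = model.AsBuilder().UseFunctionInvocation().Build();

        AIFunction tool = AIFunctionFactory.Create(
            (string city) => $"Sunny in {city}",
            name: "get_weather",
            description: "Gets the current weather for a city");

        ChatResponse response = await chat.GetResponseAsync(
            "What's the weather in Seattle?",
            new ChatOptions { Tools = [tool] });

        return (response.Text, model.Calls);
    }
}
```

The caller makes one GetResponseAsync call; the middleware is expected to make two calls to the underlying client, one that yields the tool request and one that yields the final text.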

Function from instance methods

public class Calculator
{
    [Description("Adds two numbers")]
    public int Add(int a, int b) => a + b;
}

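// _calculator is an existing Calculator instance; the created function is bound to it.
var options = new ChatOptions
{
    Tools = [AIFunctionFactory.Create(_calculator.Add)]
};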

DI registration

builder.Services.AddSingleton<IChatClient>(sp =>
{
    var client = new OpenAIClient(apiKey).AsChatClient("gpt-4o-mini");
    return client.AsBuilder()
        .UseFunctionInvocation()
        .UseLogging(sp.GetRequiredService<ILoggerFactory>())
        .UseOpenTelemetry()
        .UseDistributedCache(sp.GetRequiredService<IDistributedCache>())
        .Build();
});

For multi-provider setups, register each client as a keyed service and resolve it by key.
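
A minimal sketch of the keyed-service approach, assuming .NET 8 or later dependency injection. The keys "fast" and "smart", the AddChatProviders extension, and SummaryService are hypothetical names; the factories stand for whatever provider clients the application builds.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DependencyInjection;

public static class ChatClientRegistration
{
    // Registers two IChatClient instances under hypothetical keys.
    public static IServiceCollection AddChatProviders(
        this IServiceCollection services,
        Func<IServiceProvider, IChatClient> createFast,
        Func<IServiceProvider, IChatClient> createSmart)
    {
        services.AddKeyedSingleton<IChatClient>("fast", (sp, _) => createFast(sp));
        services.AddKeyedSingleton<IChatClient>("smart", (sp, _) => createSmart(sp));
        return services;
    }
}

// Hypothetical consumer that asks for one specific client by key.
public sealed class SummaryService([FromKeyedServices("fast")] IChatClient chat)
{
    public async Task<string> SummarizeAsync(string text, CancellationToken ct = default)
    {
        ChatResponse response = await chat.GetResponseAsync(
            $"Summarize: {text}", cancellationToken: ct);
        return response.Text;
    }
}
```

Outside constructor injection, the same clients can be resolved with sp.GetRequiredKeyedService<IChatClient>("smart").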

ChatMessage content

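var message = new ChatMessage(ChatRole.User, [
    new TextContent("Describe this image:"),
    new DataContent(imageBytes, "image/jpeg"),   // inline bytes
    new UriContent(imageUri, "image/jpeg")       // or a reference by URI
]);

Multimodal input is supported: a single message can mix text, inline data, and URI references, subject to what the underlying model accepts.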

Function call control

new ChatOptions
{
    Tools = [...],
    ToolMode = ChatToolMode.Auto,    // LLM decides
    // ChatToolMode.RequireAny       // must call a tool
    // ChatToolMode.RequireSpecific("name")
    // ChatToolMode.None             // disable
};

Custom middleware

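public class RetryChatClient(IChatClient inner, int maxRetries) : DelegatingChatClient(inner)
{
    // Only the non-streaming path is retried; streaming falls through to the inner client.
    public override async Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        for (int attempt = 0; ; attempt++)
        {
            try { return await base.GetResponseAsync(messages, options, cancellationToken); }
            catch (Exception ex) when (attempt < maxRetries && IsTransient(ex))
            {
                // Transient failure with retries left: loop and try again.
                // On the last attempt the filter is false and the exception propagates.
            }
        }
    }

    // Hypothetical classifier; adjust to the exceptions your provider throws.
    private static bool IsTransient(Exception ex) =>
        ex is HttpRequestException or TimeoutException;
}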

// Use
chat = chat.AsBuilder().Use(c => new RetryChatClient(c, 3)).Build();

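DelegatingChatClient provides the wrapper pattern: it forwards every member to the inner client, so you override only the members you need to change.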

Error handling

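try { var r = await chat.GetResponseAsync(messages); }
catch (HttpRequestException) { /* network failure */ }
catch (TaskCanceledException) { /* timeout or caller cancellation */ }

Provider SDKs may throw their own exception types or wrap the transport error, so catch those types where you know them and inspect InnerException otherwise.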

Senior considerations

Always include OpenTelemetry and logging in production pipelines.
Caching is critical for cost: identical queries are common in tests and in RAG variants.
Automatic function calling is vastly simpler than parsing tool-call JSON by hand.
Wrap with Polly for retries and circuit breaking, paired with a resilience handler on the underlying HttpClient.