IChatClient & Pipeline
Key Points
- IChatClient is the canonical .NET chat abstraction. Vendor-neutral.
- Pipeline pattern: chain middleware with AsBuilder(). Each Use* wraps the next.
- Built-in middleware: function invocation, logging, OpenTelemetry, distributed cache, options telemetry.
- Streaming via GetStreamingResponseAsync returns IAsyncEnumerable<ChatResponseUpdate>.
- Function calling: declare functions with [Description]; UseFunctionInvocation() middleware auto-dispatches.
The IChatClient interface
public interface IChatClient : IDisposable
{
    Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default);

    object? GetService(Type serviceType, object? serviceKey = null);
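    // ChatClientMetadata is retrieved via GetService(typeof(ChatClientMetadata));
    // the Metadata property existed only in early preview releases.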
}
Every concrete client (OpenAI, Azure OpenAI, Anthropic adapter, Ollama, ONNX) implements this.
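To make the contract concrete, here is a minimal sketch of a custom implementation. EchoChatClient is a hypothetical name, and the sketch assumes a Microsoft.Extensions.AI.Abstractions release in which the interface consists of GetResponseAsync, GetStreamingResponseAsync, GetService, and Dispose (older previews required additional members). A stub like this can stand in for a real provider in offline tests.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical stub: echoes the last message back instead of calling a model.
public sealed class EchoChatClient : IChatClient
{
    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        string last = messages.LastOrDefault()?.Text ?? string.Empty;
        var reply = new ChatMessage(ChatRole.Assistant, $"echo: {last}");
        return Task.FromResult(new ChatResponse(reply));
    }

    public async IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        string last = messages.LastOrDefault()?.Text ?? string.Empty;

        // Emit the reply one word at a time to mimic incremental updates.
        foreach (string word in $"echo: {last}".Split(' '))
        {
            cancellationToken.ThrowIfCancellationRequested();
            await Task.Yield();
            yield return new ChatResponseUpdate(ChatRole.Assistant, word + " ");
        }
    }

    // Lets callers ask the client for itself; a real client would also
    // return its ChatClientMetadata here.
    public object? GetService(Type serviceType, object? serviceKey = null) =>
        serviceKey is null && serviceType.IsInstanceOfType(this) ? this : null;

    public void Dispose() { }
}
```

Because the stub implements the full interface, it can be passed anywhere an IChatClient is expected, including as the innermost client of a pipeline.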
Basic usage
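// Older preview packages exposed this as new OpenAIClient(apiKey).AsChatClient("gpt-4o-mini").
IChatClient chat = new OpenAIClient(apiKey).GetChatClient("gpt-4o-mini").AsIChatClient();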
var response = await chat.GetResponseAsync("Hello!");
Console.WriteLine(response.Text);
Streaming
await foreach (var update in chat.GetStreamingResponseAsync("Tell me a story"))
{
    Console.Write(update.Text);
}
Each ChatResponseUpdate carries an incremental text fragment plus the role and completion metadata (such as the finish reason).
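When the full text is needed after streaming finishes, the fragments have to be accumulated by the caller. The following is a minimal sketch; StreamingHelpers and CollectAsync are hypothetical names, not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

public static class StreamingHelpers
{
    // Forwards each text fragment to onText as it arrives and returns the
    // concatenated text once the stream completes.
    public static async Task<string> CollectAsync(
        IAsyncEnumerable<ChatResponseUpdate> updates,
        Action<string> onText,
        CancellationToken cancellationToken = default)
    {
        var full = new StringBuilder();

        await foreach (ChatResponseUpdate update in updates.WithCancellation(cancellationToken))
        {
            // Some updates carry only metadata (role, finish reason) and no text.
            if (string.IsNullOrEmpty(update.Text))
            {
                continue;
            }

            onText(update.Text);
            full.Append(update.Text);
        }

        return full.ToString();
    }
}
```

Usage: `string story = await StreamingHelpers.CollectAsync(chat.GetStreamingResponseAsync("Tell me a story"), Console.Write);`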
Multi-turn
var messages = new List<ChatMessage>
{
    new(ChatRole.System, "You are a helpful assistant."),
    new(ChatRole.User, "What's 2+2?"),
    new(ChatRole.Assistant, "4."),
    new(ChatRole.User, "Multiplied by 10?")
};
var resp = await chat.GetResponseAsync(messages);
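The client itself is stateless, so the caller owns the history and must append each reply before the next turn. A minimal sketch of that bookkeeping follows; Conversation and AskAsync are hypothetical names, and the sketch assumes a release in which ChatResponse exposes a Messages list.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

public static class Conversation
{
    // Adds the user's turn, sends the whole history, and records the reply
    // so the next call sees the full conversation.
    public static async Task<string> AskAsync(
        IChatClient chat,
        List<ChatMessage> history,
        string userText,
        CancellationToken cancellationToken = default)
    {
        history.Add(new ChatMessage(ChatRole.User, userText));

        ChatResponse response = await chat.GetResponseAsync(
            history, cancellationToken: cancellationToken);

        // Keep the assistant's messages; otherwise the model forgets its own answers.
        history.AddRange(response.Messages);
        return response.Text;
    }
}
```

Each call grows the history by the user message plus whatever the response contains, so long conversations eventually need trimming or summarizing to stay within the model's context window.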
ChatOptions
var options = new ChatOptions
{
    Temperature = 0.2f,
    MaxOutputTokens = 500,
    StopSequences = ["END"],
    Tools = [AIFunctionFactory.Create(GetWeather)],
    ToolMode = ChatToolMode.Auto,
    ResponseFormat = ChatResponseFormat.Json,
    AdditionalProperties = new() { ["seed"] = 42 }
};
Use AdditionalProperties to pass vendor-specific settings (reasoning_effort, etc.) through to the provider.
Pipeline (AsBuilder)
chat = chat.AsBuilder()
    .UseFunctionInvocation()
    .UseLogging(loggerFactory)
    .UseOpenTelemetry(sourceName: "Application", configure: c => c.EnableSensitiveData = false)
    .UseDistributedCache(cache)
    .Build();
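Each Use* call registers a middleware factory and returns the same builder for chaining; Build() produces the final IChatClient.
Pipeline executes outside-in: the first Use* call becomes the outermost layer, so in the chain above a request passes through function invocation, logging, OpenTelemetry, and the distributed cache before it reaches the provider client. A cache hit returns early and skips the provider call.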
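The ordering can be checked with two tracing middlewares wrapped around a stub client. This is a sketch under stated assumptions: CannedChatClient, TracingChatClient, and PipelineOrderDemo are hypothetical names, and the stub assumes the interface has only the members implemented here.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical terminal client that stands in for a real provider.
public sealed class CannedChatClient : IChatClient
{
    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        CancellationToken cancellationToken = default) =>
        Task.FromResult(new ChatResponse(new ChatMessage(ChatRole.Assistant, "ok")));

    public async IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        await Task.Yield();
        yield return new ChatResponseUpdate(ChatRole.Assistant, "ok");
    }

    public object? GetService(Type serviceType, object? serviceKey = null) => null;

    public void Dispose() { }
}

// Hypothetical middleware that records when it runs relative to the inner client.
public sealed class TracingChatClient(IChatClient inner, string name, List<string> log)
    : DelegatingChatClient(inner)
{
    public override async Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages, ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        log.Add($"{name}: before");
        ChatResponse response = await base.GetResponseAsync(messages, options, cancellationToken);
        log.Add($"{name}: after");
        return response;
    }
}

public static class PipelineOrderDemo
{
    public static async Task<List<string>> RunAsync()
    {
        var log = new List<string>();

        IChatClient chat = new CannedChatClient()
            .AsBuilder()
            .Use(inner => new TracingChatClient(inner, "first", log))   // outermost
            .Use(inner => new TracingChatClient(inner, "second", log))  // closer to the provider
            .Build();

        await chat.GetResponseAsync("ping");

        // log: "first: before", "second: before", "second: after", "first: after"
        return log;
    }
}
```

The middleware registered first sees the request first and the response last, which is why function invocation is usually placed near the top of the chain and caching near the bottom.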
Function calling
[Description("Gets the current weather for a city")]
async Task<Weather> GetWeather([Description("City name")] string city, CancellationToken ct = default)
{
    var w = await _service.GetAsync(city, ct);
    return new Weather(w.Temp, w.Condition);
}

// Use:
var resp = await chat.GetResponseAsync(
    "What's the weather in Seattle?",
    new ChatOptions { Tools = [AIFunctionFactory.Create(GetWeather)] });
The UseFunctionInvocation() middleware:
1. Sees the LLM's tool-call response.
2. Dispatches to the .NET method.
3. Adds the tool result to the messages.
4. Calls the LLM again.
5. Returns the final answer.
Steps 1 to 4 repeat until the LLM stops calling tools.
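The dispatch step matches the tool call by function name, and the model chooses a tool based on its name and description, so it can help to inspect what AIFunctionFactory produces before handing it to the pipeline. A minimal sketch, in which WeatherTools, the canned return value, and the "get_weather" name are illustrative assumptions:

```csharp
using System;
using System.ComponentModel;
using Microsoft.Extensions.AI;

public static class WeatherTools
{
    [Description("Gets the current weather for a city")]
    public static string GetWeather([Description("City name")] string city) =>
        $"Sunny in {city}"; // canned result for illustration only

    // Overrides the default name (the method name) with an explicit one;
    // the description still comes from the [Description] attribute.
    public static AIFunction Create() =>
        AIFunctionFactory.Create(GetWeather, name: "get_weather");
}
```

Usage: `AIFunction fn = WeatherTools.Create(); Console.WriteLine($"{fn.Name}: {fn.Description}");` prints the name and description that are sent to the model alongside the parameter schema.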
Function from instance methods
public class Calculator
{
    [Description("Adds two numbers")]
    public int Add(int a, int b) => a + b;
}

// _calculator is an existing Calculator instance; the function stays bound to it.
var options = new ChatOptions
{
    Tools = [AIFunctionFactory.Create(_calculator.Add)]
};
DI registration
builder.Services.AddSingleton<IChatClient>(sp =>
{
    // Older preview packages exposed this as new OpenAIClient(apiKey).AsChatClient("gpt-4o-mini").
    var client = new OpenAIClient(apiKey).GetChatClient("gpt-4o-mini").AsIChatClient();
    return client.AsBuilder()
        .UseFunctionInvocation()
        .UseLogging(sp.GetRequiredService<ILoggerFactory>())
        .UseOpenTelemetry()
        .UseDistributedCache(sp.GetRequiredService<IDistributedCache>())
        .Build();
});
Alternatively, register named clients as keyed services (for example with AddKeyedSingleton) for multi-provider setups.
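A keyed registration might look like the following sketch. The keys "fast" and "smart", the ChatClientRegistration extension, and the Summarizer consumer are hypothetical; the keyed-service calls come from Microsoft.Extensions.DependencyInjection 8.0 or later.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.DependencyInjection;

public static class ChatClientRegistration
{
    // Registers two clients under different keys. The factories are supplied
    // by the caller so that this sketch stays provider-neutral.
    public static IServiceCollection AddNamedChatClients(
        this IServiceCollection services,
        Func<IServiceProvider, IChatClient> createFast,
        Func<IServiceProvider, IChatClient> createSmart)
    {
        services.AddKeyedSingleton<IChatClient>("fast", (sp, _) => createFast(sp));
        services.AddKeyedSingleton<IChatClient>("smart", (sp, _) => createSmart(sp));
        return services;
    }
}

// A consumer selects its client by key through constructor injection.
public sealed class Summarizer([FromKeyedServices("fast")] IChatClient chat)
{
    public async Task<string> SummarizeAsync(
        string text, CancellationToken cancellationToken = default)
    {
        ChatResponse response = await chat.GetResponseAsync(
            $"Summarize in one sentence: {text}", cancellationToken: cancellationToken);
        return response.Text;
    }
}
```

Outside constructor injection, `sp.GetRequiredKeyedService<IChatClient>("smart")` resolves a client by key directly.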
ChatMessage content
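var message = new ChatMessage(ChatRole.User, [
    new TextContent("Describe this image:"),
    new DataContent(imageBytes, "image/jpeg"),  // inline bytes
    new UriContent(imageUri, "image/jpeg")      // or a reference by URI
]);
Multimodal input is supported when the underlying model and provider accept it.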
Function call control
new ChatOptions
{
    Tools = [...],
    ToolMode = ChatToolMode.Auto,              // LLM decides
    // ChatToolMode.RequireAny                 // must call a tool
    // ChatToolMode.RequireSpecific("name")
    // ChatToolMode.None                       // disable
};
Custom middleware
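public class RetryChatClient(IChatClient inner, int maxRetries) : DelegatingChatClient(inner)
{
    public override async Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        for (int i = 0; i <= maxRetries; i++)
        {
            try { return await base.GetResponseAsync(messages, options, cancellationToken); }
            catch (Exception ex) when (i < maxRetries && IsTransient(ex)) { /* retry */ }
        }

        // Not reached for maxRetries >= 0: the last attempt returns or rethrows.
        throw new InvalidOperationException("Retry loop exited unexpectedly.");
    }

    // Placeholder policy (an assumption); decide what counts as transient for your provider.
    private static bool IsTransient(Exception ex) => ex is HttpRequestException or TimeoutException;
}

// Use
chat = chat.AsBuilder().Use(c => new RetryChatClient(c, 3)).Build();
DelegatingChatClient forwards every member to the inner client, so you override only the calls you want to wrap. GetStreamingResponseAsync is not overridden here, so streaming calls pass through without retries.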
Error handling
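try { var r = await chat.GetResponseAsync(messages); }
catch (HttpRequestException) { /* network */ }
catch (TaskCanceledException) { /* timeout or cancellation */ }
Provider SDKs also throw their own exception types (for example, ClientResultException from the OpenAI SDK); catch those directly, or check InnerException for the underlying cause.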
Senior considerations
- Always include OpenTelemetry and logging in production pipelines.
- Caching is critical for cost: identical queries are common in tests and RAG variants.
- Automatic function calling is far simpler than handling tool-call JSON manually.
- Wrap calls with Polly for retries and circuit breaking, and pair it with an HTTP resilience handler.
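For the retry point above, the core idea can be sketched without any library. This is a hand-rolled exponential backoff helper, not Polly itself; Retry, WithBackoffAsync, and the default delay are illustrative assumptions, and a production setup would more likely use a resilience library with jitter and circuit breaking.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Retry
{
    // Runs the operation, retrying transient failures with exponentially
    // growing delays (baseDelay, 2x, 4x, ...). The final failure propagates.
    public static async Task<T> WithBackoffAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        Func<Exception, bool> isTransient,
        int maxRetries = 3,
        TimeSpan? baseDelay = null,
        CancellationToken cancellationToken = default)
    {
        TimeSpan delay = baseDelay ?? TimeSpan.FromMilliseconds(200);

        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return await operation(cancellationToken);
            }
            catch (Exception ex) when (attempt < maxRetries && isTransient(ex))
            {
                // Wait before the next attempt; cancellation aborts the wait.
                await Task.Delay(delay * Math.Pow(2, attempt), cancellationToken);
            }
        }
    }
}
```

Usage: `var response = await Retry.WithBackoffAsync(ct => chat.GetResponseAsync(messages, cancellationToken: ct), ex => ex is HttpRequestException);`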