OpenAI-compatible Endpoints
Key Points
- Foundry exposes OpenAI-compatible endpoints for many models. Existing OpenAI SDK code usually works after changing configuration only: endpoint, API key, and model name.
- Same applies to: Together, Groq, Fireworks, Mistral, Ollama, vLLM, Anthropic (via some gateways).
- Massive portability: write code once; route to any provider via config.
- Limitations: vendor-specific features (extended thinking, computer use, etc.) are generally not exposed via this layer.
What is "OpenAI-compatible"
The OpenAI HTTP API (chat completions, embeddings, images) became a de facto standard. Many providers implement the same wire format: the same routes, JSON request and response shapes, and Bearer-key auth.
If your client supports OpenAI's API, point it at any compatible endpoint.
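To make the wire format concrete, here is a minimal sketch of the request that an OpenAI-compatible chat endpoint accepts: a POST to `<base>/chat/completions` with a Bearer key and a JSON body containing `model` and `messages`. `OpenAiWire` and `BuildChatRequest` are hypothetical names used only for illustration; in practice the SDK builds an equivalent request for you.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

public static class OpenAiWire
{
    // Builds (but does not send) a chat completions request for any
    // OpenAI-compatible base URL, e.g. https://api.together.xyz/v1.
    public static HttpRequestMessage BuildChatRequest(
        Uri baseUrl, string apiKey, string model, string userMessage)
    {
        // Append the route to the base URL, tolerating a trailing slash.
        var url = baseUrl.AbsoluteUri.TrimEnd('/') + "/chat/completions";

        // Minimal body: model id plus a single user message.
        var body = JsonSerializer.Serialize(new
        {
            model,
            messages = new[] { new { role = "user", content = userMessage } }
        });

        var request = new HttpRequestMessage(HttpMethod.Post, url)
        {
            Content = new StringContent(body, Encoding.UTF8, "application/json")
        };
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
        return request;
    }
}
```

Only `baseUrl`, `apiKey`, and `model` vary between providers, which is why switching is a configuration change rather than a code change.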
.NET pattern
using System.ClientModel;   // ApiKeyCredential
using Azure.AI.OpenAI;      // AzureOpenAIClient
using Azure.Identity;       // DefaultAzureCredential
using OpenAI;               // OpenAIClient, OpenAIClientOptions

// OpenAI direct (default endpoint)
var openAi = new OpenAIClient(
    new ApiKeyCredential(apiKey));

// Azure OpenAI (similar, but its own client type and Entra ID auth)
var azure = new AzureOpenAIClient(
    new Uri(endpoint),
    new DefaultAzureCredential());

// Foundry OpenAI-compat (uses the same OpenAIClient)
var foundry = new OpenAIClient(
    new ApiKeyCredential(apiKey),
    new OpenAIClientOptions { Endpoint = new Uri("https://my-foundry.../v1") });

// Together
var together = new OpenAIClient(
    new ApiKeyCredential(togetherKey),
    new OpenAIClientOptions { Endpoint = new Uri("https://api.together.xyz/v1") });

// Groq
var groq = new OpenAIClient(
    new ApiKeyCredential(groqKey),
    new OpenAIClientOptions { Endpoint = new Uri("https://api.groq.com/openai/v1") });
Same OpenAIClient type. Different endpoint, key, and model name. Same calling code.
As IChatClient
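// Requires the Microsoft.Extensions.AI.OpenAI package.
// Note: newer package versions replace AsChatClient(model) with
// GetChatClient(model).AsIChatClient(); use whichever your version provides.
IChatClient chat = together.AsChatClient("meta-llama/Llama-3.3-70B-Instruct-Turbo");

// The middleware pipeline is identical for every provider
chat = chat.AsBuilder().UseFunctionInvocation().Build();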
Use cases
- Provider switching: try Llama via Together, Mistral via Mistral API, all with same code.
- Cost optimization: route cheap queries to Together, hard ones to OpenAI.
- Multi-region: route by user location.
- Fallback: primary down → switch URL.
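The fallback case can be sketched without any SDK dependency. This is a minimal illustration, not a production policy: `ProviderConfig`, `ProviderFallback`, and `CallWithFallbackAsync` are hypothetical names, and `call` stands for whatever sends the request (for example, an IChatClient built for that provider's endpoint).

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical config record: one entry per OpenAI-compatible provider.
public sealed record ProviderConfig(string Name, Uri Endpoint, string Model);

public static class ProviderFallback
{
    // Tries each provider in order and returns the first successful result.
    // If every provider fails, throws an AggregateException with all failures.
    public static async Task<T> CallWithFallbackAsync<T>(
        IReadOnlyList<ProviderConfig> providers,
        Func<ProviderConfig, Task<T>> call)
    {
        var failures = new List<Exception>();
        foreach (var provider in providers)
        {
            try
            {
                return await call(provider);
            }
            catch (Exception ex)
            {
                // Simplification: real code should only fall through on
                // transient errors (timeouts, 5xx, 429), not on every exception.
                failures.Add(new InvalidOperationException($"{provider.Name} failed", ex));
            }
        }
        throw new AggregateException("All providers failed", failures);
    }
}
```

Because every provider speaks the same API, the list order is the whole routing policy: the same shape works for cost routing or region routing by choosing the list per request.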
DI / config
"AI": {
  "Endpoint": "https://api.together.xyz/v1",
  "ApiKey": "@KeyVault(...)",
  "Model": "meta-llama/Llama-3.3-70B-Instruct-Turbo"
}
// Read provider settings once at startup
var endpoint = new Uri(config["AI:Endpoint"]!);
var key = new ApiKeyCredential(config["AI:ApiKey"]!);

// Register a single IChatClient; the rest of the app never sees the provider
builder.Services.AddSingleton<IChatClient>(sp =>
    new OpenAIClient(key, new() { Endpoint = endpoint })
        .AsChatClient(config["AI:Model"]!)
        .AsBuilder().UseFunctionInvocation().Build());
Switch provider via config; no code change.
Foundry's OpenAI-compat
Foundry exposes deployed models through an OpenAI-compatible endpoint.
Auth is a Bearer token: either an API key or a Microsoft Entra token obtained via managed identity.
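// `token` is either the API key or an Entra access token.
// Entra tokens expire, so long-lived clients must refresh them
// (for example via ApiKeyCredential.Update).
var foundryChat = new OpenAIClient(
        new ApiKeyCredential(token),
        new OpenAIClientOptions { Endpoint = new Uri("https://<project>.services.ai.azure.com/openai/v1") })
    .AsChatClient("model-deployment-name"); // the deployment name, not the base model id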
Caveats
Feature support
OpenAI-compat covers basic chat + embeddings + (sometimes) function calling. Vendor-specific features:
- Anthropic Computer Use → not in OpenAI-compat.
- Anthropic prompt caching → some support.
- Gemini multimodal video → not in OpenAI-compat.
- OpenAI Assistants API → vendor-specific.
For these, use the vendor-specific SDK.
Rate limits
Each provider enforces different requests-per-minute (RPM) and tokens-per-minute (TPM) limits. Account for them in your retry strategy.
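One provider-agnostic way to handle throttling over raw HTTP is to retry on status 429, honoring the `Retry-After` header when the provider sends it and backing off exponentially otherwise. This is a sketch under those assumptions: `RateLimitRetry`, the 30-second cap, and the attempt count are illustrative choices, and SDK clients may already apply their own retry policy.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class RateLimitRetry
{
    // Delay before the next attempt: honor Retry-After when present,
    // otherwise exponential backoff (1s, 2s, 4s, ...) capped at 30s.
    public static TimeSpan NextDelay(HttpResponseMessage response, int attempt)
    {
        var retryAfter = response.Headers.RetryAfter;
        if (retryAfter?.Delta is TimeSpan delta) return delta;
        if (retryAfter?.Date is DateTimeOffset date)
        {
            var wait = date - DateTimeOffset.UtcNow;
            return wait > TimeSpan.Zero ? wait : TimeSpan.Zero;
        }
        return TimeSpan.FromSeconds(Math.Min(30, Math.Pow(2, attempt)));
    }

    // `send` must create a fresh request on each call, because an
    // HttpRequestMessage cannot be sent twice.
    public static async Task<HttpResponseMessage> SendWithRetryAsync(
        Func<Task<HttpResponseMessage>> send, int maxAttempts = 4)
    {
        for (var attempt = 0; ; attempt++)
        {
            var response = await send();
            var throttled = response.StatusCode == HttpStatusCode.TooManyRequests;
            // Return on success, on non-429 errors, or when attempts run out.
            if (!throttled || attempt == maxAttempts - 1) return response;

            var delay = NextDelay(response, attempt);
            response.Dispose();
            await Task.Delay(delay);
        }
    }
}
```

Keeping the limits (attempt count, cap) in per-provider configuration lets the same loop serve providers with very different quotas.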
Token counting
Some providers return token usage in the response; others omit it. Test each provider before relying on usage for metering or cost tracking.
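A defensive reader for the OpenAI-style `usage` object might look like the sketch below. It assumes the standard response shape (`usage.total_tokens`) and treats a missing or null `usage` as "unknown" rather than zero; `UsageReader` and `TryGetTotalTokens` are hypothetical names.

```csharp
using System.Text.Json;

public static class UsageReader
{
    // Returns total tokens from an OpenAI-style response body, or null when
    // the provider omitted the "usage" object (or sent it as null).
    public static int? TryGetTotalTokens(string responseJson)
    {
        using var doc = JsonDocument.Parse(responseJson);
        if (doc.RootElement.TryGetProperty("usage", out var usage)
            && usage.ValueKind == JsonValueKind.Object
            && usage.TryGetProperty("total_tokens", out var total)
            && total.ValueKind == JsonValueKind.Number
            && total.TryGetInt32(out var value))
        {
            return value;
        }
        return null;
    }
}
```

Returning null instead of 0 keeps "the provider did not report usage" distinguishable from "the call used no tokens", which matters for billing dashboards.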
Differences in tool calling
Even with OpenAI-compat, tool format may differ subtly. Test thoroughly.
Senior strategy
Use OpenAI-compat as the default abstraction:
- It lets you swap providers cheaply.
- For features that require a vendor-specific API, build a small adapter and isolate it behind its own interface.
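One possible shape for such an adapter is sketched below. Everything here is hypothetical (`ICachedPromptClient`, `PortableCachedPromptClient`, and prompt caching as the example capability): application code depends on the small interface, a vendor SDK implementation would live behind it, and a portable fallback routes through the generic OpenAI-compatible path.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical port for one vendor-only capability (prompt caching is just an example).
public interface ICachedPromptClient
{
    Task<string> CompleteWithCachedPrefixAsync(string cachedPrefix, string userMessage);
}

// Portable fallback for providers without the capability: it sends
// prefix + message through the generic completion delegate, with no caching.
public sealed class PortableCachedPromptClient : ICachedPromptClient
{
    private readonly Func<string, Task<string>> _complete;

    public PortableCachedPromptClient(Func<string, Task<string>> complete)
        => _complete = complete;

    public Task<string> CompleteWithCachedPrefixAsync(string cachedPrefix, string userMessage)
        => _complete(cachedPrefix + "\n\n" + userMessage);
}
```

A vendor-specific implementation of the same interface can then be registered in DI only for the provider that supports the feature, so the rest of the code stays provider-neutral.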
Anti-patterns
- ❌ Assuming all features work everywhere via OpenAI-compat.
- ❌ Pinning to one provider's quirks.
- ❌ No fallback strategy.