OpenAI Models
Key Points
- GPT-5 family (2026): flagship reasoning + multimodal; most capable.
- GPT-4o: balanced flagship; multimodal (text + image + audio); 128K context.
- GPT-4o mini: 90% of utility at 1/10 cost. Default for cost-sensitive workloads.
- o-series (o1, o3): explicit reasoning; slow but excellent for math/code/logic.
- Embeddings: text-embedding-3-small (cheap), text-embedding-3-large (best).
- Whisper / TTS: audio.
- Direct API or Azure OpenAI (Azure-hosted; SOC2/HIPAA, regional).
Model lineup (2026)
| Model | Strength | Pricing (~$/M input) |
|---|---|---|
| GPT-5 | Top capability | $15-30 |
| GPT-4o | Balanced flagship | $2.50 |
| GPT-4o mini | Cheap workhorse | $0.15 |
| o3 | Strongest reasoning | $20+ |
| o3-mini | Cheap reasoning | $1 |
| text-embedding-3-large | Best embedding | $0.13 |
| text-embedding-3-small | Cheap embedding | $0.02 |
Output tokens ~3-5x input cost. Prompt caching ~10x discount on repeated prefixes.
Strengths
- Code: top-tier across benchmarks.
- Function calling: most mature; strict JSON Schema; parallel.
- Multimodal: text + image + audio (GPT-4o native).
- Tool ecosystem: Custom GPTs, Assistants API.
Weaknesses
- Cost at frontier tier.
- Rate limits — quota pain in early days for new accounts.
- Hallucinations still a thing.
.NET integration
// Direct OpenAI
new OpenAIClient(apiKey).AsChatClient("gpt-4o-mini")
.AsBuilder().UseFunctionInvocation().Build();
// Azure OpenAI (managed identity)
new AzureOpenAIClient(new Uri(endpoint), new DefaultAzureCredential())
.AsChatClient(deploymentName)
.AsBuilder().UseFunctionInvocation().Build();
Both expose IChatClient via Microsoft.Extensions.AI.OpenAI.
Reasoning models
Slower; more output tokens; better at hard tasks.
Function calling
OpenAI strict mode: schema enforced, no malformed responses.
chat.GetResponseAsync(messages, new ChatOptions
{
Tools = [AIFunctionFactory.Create(GetWeather)]
});
Auto-generated schema; auto-dispatched via UseFunctionInvocation().
Vision
var msg = new ChatMessage(ChatRole.User, [
new TextContent("Describe this image:"),
new DataContent(imageBytes, "image/jpeg")
]);
GPT-4o handles. Charges per "high"/"low" detail.
Whisper (audio transcribe)
new OpenAIClient(apiKey).GetAudioClient("whisper-1");
var transcript = await audioClient.TranscribeAudioAsync(audioStream, fileName);
TTS (text-to-speech)
new OpenAIClient(apiKey).GetAudioClient("tts-1");
var audio = await audioClient.GenerateSpeechFromTextAsync("hello", GeneratedSpeechVoice.Alloy);
Image generation (DALL-E / GPT-Image)
new OpenAIClient(apiKey).GetImageClient("dall-e-3");
var img = await imageClient.GenerateImageAsync("a cat in space");
Azure OpenAI specifics
- Provisioned Throughput Units (PTUs): reserved capacity; consistent latency. For high-RPS production.
- Standard tier: pay-per-token; rate-limited.
- Regional availability: not all models in all regions.
- Azure AD auth + Managed Identity: zero-secret access.
- Content filters: built-in moderation; tunable.
- Dedicated capacity for compliance / SLA.
Senior considerations
When OpenAI direct vs Azure OpenAI?
| Need | Pick |
|---|---|
| Newest models day-1 | OpenAI direct |
| SOC2 / HIPAA / data residency | Azure OpenAI |
| Cost flexibility | OpenAI direct (pay-per-use) |
| Reserved capacity | Azure OpenAI PTUs |
| Existing Azure stack | Azure OpenAI |
Quotas
OpenAI direct: account-level rate limits (TPM, RPM). Azure: per-deployment.
Streaming
await foreach (var update in chat.GetStreamingResponseAsync(prompt))
yield return update.Text ?? "";
Both support natively.
Function calling parallel
new ChatOptions
{
Tools = [...],
ToolMode = ChatToolMode.Auto,
AllowMultipleToolCalls = true // OpenAI parallel-tool-calls
};
OpenAI supports calling multiple tools per turn.
Cost optimization
- gpt-4o-mini default.
- Promote to gpt-4o or GPT-5 only when measurably needed.
- Prompt caching: structure prompts with stable prefix.
max_tokenscap.