Vendor Trade-offs & Lock-In
Key Points
- Lock-in vectors: model-specific features, prompts tuned to one model, vendor-specific tools (Assistants API, Vertex Search), regional availability, contracts.
- Mitigation:
IChatClientabstraction, vendor-portable prompts, multi-provider routing, prompt-injection-aware design. - Multi-provider strategy: primary + fallback for resilience; cheap + smart for cost; redundant for high SLO.
- Compliance: SOC2, HIPAA, GDPR, BAAs vary by provider and region. Match to your needs.
Lock-in vectors
1. Model-specific features
- OpenAI Assistants API.
- Vertex Search.
- Anthropic Computer Use.
Use only when YOUR app needs it. Wrap minimally.
2. Prompt tuning
Prompts that work great in one model may fail in another. "Just plug Claude" is rarely seamless.
Mitigation: prompts in config; A/B test on switch.
3. Tool/function-call formats
Strict JSON Schema (OpenAI) vs. tool_use (Anthropic) vs. functions (Gemini). Microsoft.Extensions.AI abstracts; per-vendor quirks remain.
4. Pricing changes
Vendors raise/lower prices unilaterally. Multi-provider mitigates.
5. Quotas & availability
New OpenAI features sometimes restricted. Azure OpenAI has staged rollouts. Anthropic limited rate limits.
6. Regional / data residency
- Azure OpenAI: data stays in Azure region.
- OpenAI direct: routes through US.
- Vertex AI: per-region.
- AWS Bedrock: per-region.
For EU / regulated, choose accordingly.
7. Compliance / certifications
| Vendor | SOC2 | HIPAA | GDPR | BAA |
|---|---|---|---|---|
| Azure OpenAI | ✅ | ✅ | ✅ | ✅ (with agreement) |
| OpenAI direct | ✅ | (Enterprise) | ✅ | (limited) |
| AWS Bedrock | ✅ | ✅ | ✅ | ✅ |
| Anthropic API | ✅ | (Enterprise) | ✅ | (Enterprise) |
| GCP Vertex | ✅ | ✅ | ✅ | ✅ |
For healthcare: Azure OpenAI / Bedrock / GCP with BAA.
Multi-provider strategies
1. Primary + fallback
public async Task<ChatResponse> GetWithFallback(IEnumerable<ChatMessage> msgs, ChatOptions? opts, CancellationToken ct)
{
try { return await _primary.GetResponseAsync(msgs, opts, ct); }
catch (Exception ex) when (IsTransient(ex))
{
return await _fallback.GetResponseAsync(msgs, opts, ct);
}
}
2. Cheap default + smart escalation
Per-request: easy → cheap; hard → expensive.
var resp = await _cheap.GetResponseAsync(msgs, opts, ct);
if (NeedsMoreThought(resp))
return await _smart.GetResponseAsync(msgs, opts, ct);
return resp;
3. Tenant-tier routing
Free tier: gpt-4o-mini. Premium: GPT-5.
4. Geographic routing
EU users: Azure OpenAI EU region. US users: US.
5. A/B testing
Roll out new model to 5% of traffic; measure quality + cost.
Cost vs lock-in trade-off
More lock-in → cheaper / more capable (tied to vendor optimizations)
Less lock-in → flexibility; some perf left on table
Senior choice: lock-in OK for prototyping; build abstraction before scale.
Migration playbook
When you decide to swap providers:
- Audit prompts for vendor-specific assumptions.
- Build adapter if not using
IChatClient. - Eval suite runs against new provider.
- A/B test in production.
- Cutover with monitoring.
- Keep old as fallback for N weeks.
Allow 2-4 weeks for non-trivial systems.
Anti-patterns
- ❌ "Locked in forever" because of code coupling.
- ❌ Single provider; outage cripples app.
- ❌ Hardcoded model strings.
- ❌ Prompts that only work on one vendor.
- ❌ No abstraction layer.
Senior strategies
Build for portability from day 1
// Don't:
var resp = await _openAi.GetChatCompletionsAsync(...);
// Do:
var resp = await _chat.GetResponseAsync(messages, options);
Evaluate quarterly
Models evolve. Today's pick may be displaced. Run evals against new options every quarter.
Negotiate contracts
For volume customers: enterprise contracts (Azure OpenAI Enterprise; Anthropic Enterprise). SLA, capacity guarantees, BAAs.
Cost-cap per tenant
Hard caps prevent runaway. Budget alerts at 80%.
Track usage by provider
// Tag spans
activity?.SetTag("gen_ai.system", "openai");
activity?.SetTag("gen_ai.request.model", "gpt-4o");
OTel splits cost by provider in dashboards.
Real-world hybrid examples
Production app:
- Azure OpenAI gpt-4o for chat (compliance, integration)
- Anthropic Claude for code generation (better quality)
- Together AI Llama for cheap classification
- text-embedding-3-small for embeddings
- Cohere Rerank for reranking
- Whisper local for offline audio
Each chosen for fit.
Senior consideration
The AI vendor landscape will reorder several times in the next few years. Don't lock in. Don't over-engineer abstractions either — IChatClient + IEmbeddingGenerator + careful prompt design is enough portability.