Google Gemini
Key Points
- Gemini 2.5 Pro / Flash (2026): native multimodal (text, image, video, audio).
- 2M context window in Pro tier — largest commercial.
- Two access paths: Google AI Studio (direct API) and Vertex AI (GCP managed).
- Strengths: multimodal, long context, low cost (Flash tier), code/math.
- Weaknesses: tool ecosystem less mature than OpenAI; .NET SDK less polished.
Lineup
| Model | Strength | Pricing (~$/M input) |
|---|---|---|
| Gemini 2.5 Pro | Multimodal flagship | $3.50 |
| Gemini 2.5 Flash | Cheap workhorse | $0.30 |
| Gemini Flash-8B | Tiny, fastest | $0.05 |
| text-embedding-004 | Embeddings | $0.10 |
Strengths
- Native multimodal: video, audio, image, text in one model. Process YouTube videos, audio files directly.
- 2M context: largest commercial. Big books, long codebases.
- Cost: Flash is excellent value.
- Vertex AI: GCP integration; data residency.
Weaknesses
- Function calling: less mature than OpenAI.
- Refusals / safety: occasionally aggressive.
- .NET ecosystem: smaller than OpenAI/Anthropic.
.NET integration
// Mscc.GenerativeAI community SDK
var google = new GoogleAI(apiKey);
var model = google.GenerativeModel(Model.Gemini25Pro);
var resp = await model.GenerateContent("Hello");
Or via Vertex AI SDK (Google.Cloud.AIPlatform.V1).
For IChatClient wrapper:
Several community implementations exist on NuGet.
Vertex AI vs Google AI
| Aspect | AI Studio | Vertex AI |
|---|---|---|
| Setup | API key | GCP project + IAM |
| Pricing | Public | Same; commercial billing |
| Data residency | Limited | Choose region |
| Auth | API key | Service account / Workload Identity |
| Compliance | Limited | SOC2, ISO, HIPAA |
For production: Vertex AI.
Multimodal calls
var content = new Content(
new Part { Text = "Summarize this video:" },
new Part { FileData = new FileData { MimeType = "video/mp4", FileUri = "gs://..." } });
Long videos / audio: upload to GCS first; reference by URI.
Long context
Gemini Pro accepts ~2M input tokens. Useful for long codebases, legal docs, research.
Function calling
var tool = new Tool
{
FunctionDeclarations = { new FunctionDeclaration { Name = "get_weather", Parameters = ... } }
};
Functional but less polished than OpenAI strict mode.
Code / math
Gemini 2.5 Pro is competitive with GPT-4o on benchmarks. Good for code; not o3-level for math.
Streaming
When choose Gemini
- Need video / audio inputs.
- 2M context required.
- GCP shop.
- Cost (Flash for high volume).
When skip
- Pure text + need richest .NET tooling → OpenAI / Azure OpenAI.
- Best code quality consistently → Claude.
Senior considerations
- Vertex AI for production: data residency + auth via Workload Identity.
- Token limits: even with 2M context, chunks > 1.5M can hit issues.
- Mixed media: be careful with prompt construction; videos are tokenized differently.