Store: Cosmos DB
Key Points
- Cosmos DB NoSQL has native vector indexing (.NET 9 + 2025 GA).
- Use when: vectors collocated with operational data; multi-region writes; existing Cosmos.
- Combine relational + vector in one container.
- Trade-off: cost (RUs); less mature than Azure AI Search hybrid.
Setup
var client = new CosmosClient(uri, new DefaultAzureCredential());
var database = client.GetDatabase("mydb");
var props = new ContainerProperties
{
Id = "chunks",
PartitionKeyPath = "/tenantId",
VectorEmbeddingPolicy = new VectorEmbeddingPolicy(new[]
{
new Embedding { Path = "/embedding", DataType = VectorDataType.Float32, DistanceFunction = DistanceFunction.Cosine, Dimensions = 1536 }
}),
IndexingPolicy = new IndexingPolicy
{
VectorIndexes = { new VectorIndexPath { Path = "/embedding", Type = VectorIndexType.QuantizedFlat } }
}
};
await database.CreateContainerIfNotExistsAsync(props);
Document
public class Chunk
{
public string Id { get; set; } = Guid.NewGuid().ToString();
public string TenantId { get; set; } = "";
public string DocumentId { get; set; } = "";
public string Text { get; set; } = "";
public float[] Embedding { get; set; } = [];
}
Query
var query = new QueryDefinition(
"SELECT TOP 10 c.id, c.text, VectorDistance(c.embedding, @q) AS score " +
"FROM c " +
"WHERE c.tenantId = @tenant " +
"ORDER BY VectorDistance(c.embedding, @q)")
.WithParameter("@q", queryEmbedding.Vector.ToArray())
.WithParameter("@tenant", tenantId);
var results = container.GetItemQueryIterator<dynamic>(query);
VectorDistance is the native Cosmos function.
Microsoft.Extensions.VectorData adapter
var collection = new CosmosNoSqlCollection<string, Chunk>(database, "chunks");
await foreach (var r in collection.SearchAsync(queryEmbedding, top: 10))
{ /* ... */ }
Index types
| Type | Notes |
|---|---|
Flat | Brute force; small datasets |
QuantizedFlat | Compressed; medium |
DiskANN | Fast at scale |
Hybrid
Combine vector + filter in single query. Cosmos supports filtering before vector search.
For full-text + vector: use Cosmos full-text index alongside.
Multi-region
Cosmos's strength: multi-region writes. Vector data replicated. Reads from nearest region.
Cost
- Vector storage: per byte.
- Vector search: RUs per operation. Higher than basic queries.
For 10M vectors: significant. Provisioned vs serverless trade-off.
Use cases
- E-commerce: products with embeddings; vector + filters in one query.
- Multi-tenant: tenant filter + vector.
- Global apps: multi-region writes critical.
- Existing Cosmos: avoid adding another service.
When NOT
- Pure RAG at scale: dedicated vector DB cheaper.
- Hybrid quality matters more: Azure AI Search.
Senior considerations
- Partition key: high cardinality + vector locality.
- RU monitoring: vector ops can spike.
- Index policies: tune for write vs read perf.
- Migration from old containers: vector index added in-place.