Semantic Kernel Vector Store in C#: Azure AI Search, Qdrant, and Beyond


Vector stores are specialized databases designed to efficiently store and search high-dimensional vector embeddings, making them essential for Retrieval-Augmented Generation (RAG) applications in .NET. When building AI-powered applications, you need a way to store document embeddings, perform semantic similarity searches, and retrieve relevant context for your language models. The Semantic Kernel vector store in C# provides a unified abstraction layer called IVectorStore that lets you work with multiple vector database providers using consistent code, whether you're using Azure AI Search for production workloads or InMemoryVectorStore for local development and testing.

I've worked with various vector stores in my .NET applications, and having a consistent API across different backends has been a game-changer for prototyping and deployment. The Semantic Kernel vector store in C# makes it straightforward to implement these patterns. Let's explore how to use these implementations effectively, from development through production.

The IVectorStore Abstraction

Note: Vector Store functionality in Semantic Kernel is currently in preview. APIs may change in future releases.

Semantic Kernel's IVectorStore abstraction provides a unified way to work with different vector database providers, allowing you to switch between implementations without changing your core application logic. This abstraction follows the same connector model that Semantic Kernel uses for AI services, making it easy to register vector stores in your dependency injection container through IServiceCollection.

The beauty of this abstraction is that you can develop locally using InMemoryVectorStore, then deploy to production with Azure AI Search or Qdrant by changing just a few lines of registration code. Your CRUD operations, vector searches, and collection management remain identical across all implementations. This portability is crucial when you're building enterprise applications that might need to migrate between cloud providers or scale from proof-of-concept to production.
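To make that portability concrete, here's a sketch of configuration-driven registration. It assumes a hypothetical VECTOR_STORE environment variable and the connector packages covered later in this post; everything downstream resolves IVectorStore and never knows which provider it got.

```csharp
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.InMemory;

var builder = Kernel.CreateBuilder();

// VECTOR_STORE is an assumed configuration switch for this sketch.
switch (Environment.GetEnvironmentVariable("VECTOR_STORE"))
{
    case "azure":
        builder.Services.AddAzureAISearchVectorStore(
            new Uri(Environment.GetEnvironmentVariable("AZURE_SEARCH_ENDPOINT")!),
            new Azure.Identity.DefaultAzureCredential());
        break;
    case "qdrant":
        builder.Services.AddQdrantVectorStore(host: "localhost", port: 6334);
        break;
    default:
        // Local development and tests fall back to the in-memory store.
        builder.Services.AddSingleton<IVectorStore, InMemoryVectorStore>();
        break;
}

var kernel = builder.Build();
```

The rest of the application injects IVectorStore and stays identical across all three branches.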

Semantic Kernel provides official connectors for several popular vector stores including InMemoryVectorStore (built-in), Azure AI Search, Qdrant, Redis, Pinecone, Chroma, Weaviate, and Milvus. Each connector is distributed as a separate NuGet package, so you only pull in the dependencies you actually need for your project.

Defining Vector Store Records

Before you can work with any vector store in Semantic Kernel, you need to define your data model using special attributes that tell the framework how to map your C# properties to vector store fields. The three key attributes you'll use are VectorStoreKey for the unique identifier, VectorStoreData for regular data properties, and VectorStoreVector for embedding fields that will be used in similarity searches.

Here's a complete example of a blog post record that includes metadata and a vector embedding:

using Microsoft.Extensions.VectorData;

public record BlogPost(
    [property: VectorStoreKey] string Id,
    [property: VectorStoreData] string Title,
    [property: VectorStoreData] string Content,
    [property: VectorStoreData] string[] Tags,
    [property: VectorStoreVector(1536)] ReadOnlyMemory<float> ContentEmbedding);

The VectorStoreVector attribute takes a dimension parameter (1536 in this case) that must match the dimension of your embedding model. OpenAI's text-embedding-3-small model produces 1536-dimensional embeddings (as does the older text-embedding-ada-002), while text-embedding-3-large produces 3072 dimensions by default and can be shortened via its dimensions parameter. Using ReadOnlyMemory<float> is more efficient than float[] for large embeddings because it avoids unnecessary array copies.

You can use either record types or regular classes for your vector store models. I prefer records for immutability and concise syntax, but the choice is yours based on your application's needs. The important thing is to annotate your properties correctly so Semantic Kernel knows how to serialize and deserialize your data.
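For comparison, here's the same model expressed as a mutable class with the same attributes -- behaviorally equivalent to the record above (the name BlogPostEntity is just for illustration):

```csharp
using Microsoft.Extensions.VectorData;

// Equivalent class-based model: useful when records need to be
// mutated in place or bound by serializers that require setters.
public class BlogPostEntity
{
    [VectorStoreKey]
    public string Id { get; set; } = string.Empty;

    [VectorStoreData]
    public string Title { get; set; } = string.Empty;

    [VectorStoreData]
    public string Content { get; set; } = string.Empty;

    [VectorStoreData]
    public string[] Tags { get; set; } = Array.Empty<string>();

    [VectorStoreVector(1536)]
    public ReadOnlyMemory<float> ContentEmbedding { get; set; }
}
```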

InMemoryVectorStore: Development and Testing

The InMemoryVectorStore is a built-in implementation that requires no external dependencies or infrastructure, making it perfect for local development, unit testing, and rapid prototyping. It stores all vectors in memory and performs brute-force similarity searches, which is completely acceptable for small datasets but won't scale to production volumes.

Here's how to set up an InMemoryVectorStore with a complete working example; note that InMemoryVectorStore ships in the separate Microsoft.SemanticKernel.Connectors.InMemory package:

using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.InMemory;
using Microsoft.SemanticKernel.Embeddings;

var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion("gpt-4o", Environment.GetEnvironmentVariable("OPENAI_API_KEY")!);
builder.AddOpenAITextEmbeddingGeneration("text-embedding-3-small", Environment.GetEnvironmentVariable("OPENAI_API_KEY")!);
builder.Services.AddSingleton<IVectorStore, InMemoryVectorStore>();
var kernel = builder.Build();

var store = kernel.Services.GetRequiredService<IVectorStore>();
var collection = store.GetCollection<string, BlogPost>("blog-posts");
await collection.CreateCollectionIfNotExistsAsync();

var embeddingService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();

The beauty of this setup is that you can start building your RAG application immediately without setting up Docker containers, cloud services, or database connections. I use InMemoryVectorStore for all my initial development and unit tests, then switch to a persistent store when I'm ready to deploy.

For production use, InMemoryVectorStore is only suitable if you have a small, static dataset that fits comfortably in memory (think hundreds or low thousands of items, not millions). It's also appropriate for ephemeral scenarios where you generate embeddings on-the-fly for a single user session and don't need persistence. However, for any serious production workload with large datasets, user-generated content, or need for durability, you'll want to use a persistent vector store like Azure AI Search or Qdrant.

Azure AI Search: Managed Production Vector Search

Azure AI Search (formerly known as Azure Cognitive Search) is Microsoft's fully managed search-as-a-service offering that includes powerful vector search capabilities alongside traditional text search. It's an excellent choice for production .NET applications because it integrates seamlessly with Azure's security model, scales automatically, and provides enterprise features like high availability and disaster recovery.

To use Azure AI Search as your vector store, first install the connector package:

dotnet add package Microsoft.SemanticKernel.Connectors.AzureAISearch

Then register the vector store in your dependency injection container:

using Microsoft.SemanticKernel.Connectors.AzureAISearch;

builder.Services.AddAzureAISearchVectorStore(
    new Uri(Environment.GetEnvironmentVariable("AZURE_SEARCH_ENDPOINT")!),
    new Azure.Identity.DefaultAzureCredential());

// Or with API key
builder.Services.AddAzureAISearchVectorStore(
    new Uri(Environment.GetEnvironmentVariable("AZURE_SEARCH_ENDPOINT")!),
    new Azure.AzureKeyCredential(Environment.GetEnvironmentVariable("AZURE_SEARCH_KEY")!));

I recommend using DefaultAzureCredential for production applications because it supports managed identities and eliminates the need to manage API keys. During local development, it automatically falls back to your Visual Studio or Azure CLI credentials, making the developer experience seamless.

Azure AI Search automatically creates the index schema based on your vector store record attributes when you call CreateCollectionIfNotExistsAsync. This works well for standard configurations, though production deployments may require manual index tuning for optimal performance -- see Azure AI Search index configuration for tuning options. The service supports multiple vector search algorithms including HNSW (Hierarchical Navigable Small World) for fast approximate nearest neighbor search -- see Azure AI Search vector search ranking for algorithm performance details.

One of Azure AI Search's standout features is its ability to combine vector similarity search with traditional keyword search, filters, and facets in a single query. This hybrid search capability is crucial for real-world applications where users might want to find semantically similar documents but also filter by date, author, category, or other metadata fields. We'll explore this more in the hybrid search section.

Qdrant: Open-Source Alternative

Qdrant is a high-performance, open-source vector database built specifically for similarity search at scale. It's written in Rust for maximum performance and offers a rich feature set including filtering, payload indexing, and quantization for reduced memory usage. Qdrant is an excellent choice if you want full control over your infrastructure, need to run on-premises, or prefer open-source solutions.

For local development, the easiest way to run Qdrant is with Docker. Map both the REST port (6333) and the gRPC port (6334), since the .NET client communicates over gRPC:

docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant

Once Qdrant is running, install the Semantic Kernel connector:

dotnet add package Microsoft.SemanticKernel.Connectors.Qdrant

Then register it in your application:

using Microsoft.SemanticKernel.Connectors.Qdrant;

builder.Services.AddQdrantVectorStore(
    host: "localhost",
    port: 6334, // the .NET connector talks to Qdrant over gRPC on 6334
    https: false);

// For production Qdrant Cloud
builder.Services.AddQdrantVectorStore(
    host: "your-cluster.qdrant.io",
    port: 6334,
    https: true,
    apiKey: Environment.GetEnvironmentVariable("QDRANT_API_KEY"));

Qdrant Cloud is the managed service option if you want the benefits of open-source software without the operational overhead of self-hosting. It offers the same API as self-hosted Qdrant, so your code remains identical between development and production environments.

For production deployments, you'll need to consider factors like backup strategies, monitoring, scaling policies, and high availability configuration. Qdrant supports clustering and replication for fault tolerance, but you'll need to manage these aspects yourself if you're self-hosting. Qdrant Cloud handles these concerns for you, similar to how Azure AI Search provides a fully managed experience.

CRUD Operations: Upsert, Get, Delete

Once you have a collection instance, performing CRUD operations on your vector store is straightforward and consistent across all connector implementations. The primary operations you'll use are UpsertAsync for creating or updating records, GetAsync for retrieving individual records by key, and DeleteAsync for removing records.

Here's how to insert or update a blog post with its embedding:

var content = "Async programming in C# allows you to write non-blocking code...";
var embedding = await embeddingService.GenerateEmbeddingAsync(content);

var post = new BlogPost(
    Id: "post-001",
    Title: "Async/Await in C#",
    Content: content,
    Tags: new[] { "csharp", "async", "dotnet" },
    ContentEmbedding: embedding);

await collection.UpsertAsync(post);

The UpsertAsync method is idempotent, meaning you can call it repeatedly with the same ID and it will either create a new record or update the existing one. This behavior is perfect for scenarios where you're syncing data from another source and don't want to check for existence before every write.

To retrieve a record by its key:

var retrieved = await collection.GetAsync("post-001");
if (retrieved != null)
{
    Console.WriteLine($"Found: {retrieved.Title}");
}

And to delete a record:

await collection.DeleteAsync("post-001");

For batch operations, you can use the batch methods to improve performance when working with multiple records:

var posts = new List<BlogPost> { post1, post2, post3 };
await foreach (var key in collection.UpsertBatchAsync(posts))
{
    Console.WriteLine($"Upserted: {key}");
}

var keys = new[] { "post-001", "post-002", "post-003" };
await foreach (var post in collection.GetBatchAsync(keys))
{
    Console.WriteLine($"Retrieved: {post?.Title}");
}

await collection.DeleteBatchAsync(keys);

Batch operations are significantly more efficient than individual operations when working with multiple records because they reduce network round trips and allow the vector store to optimize the operation internally. I always use batch methods when I'm processing datasets during indexing or bulk updates.

Vector Search: SearchAsync

The real power of vector stores comes from semantic similarity search, where you can find records that are conceptually similar to a query even if they don't share exact keywords. The SearchAsync method performs this search using a query embedding and returns results ranked by similarity score.

Here's a complete example showing how to perform a vector search with filtering:

var queryEmbedding = await embeddingService.GenerateEmbeddingAsync("async programming in C#");

var searchOptions = new VectorSearchOptions<BlogPost>
{
    Filter = post => post.Tags.Contains("csharp"),
    VectorProperty = post => post.ContentEmbedding
};

var results = collection.SearchAsync(queryEmbedding, top: 5, searchOptions);

await foreach (var result in results)
{
    Console.WriteLine($"Score: {result.Score:F3} | Title: {result.Record.Title}");
}

The similarity score typically ranges from 0 to 1, where higher scores indicate greater similarity. The exact scoring algorithm depends on the vector store implementation and configuration, but generally scores above 0.8 indicate very similar content, while scores below 0.5 might not be relevant to your query.
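One practical consequence is that you can drop weak matches before they ever reach your prompt context. A minimal sketch, assuming the preview SearchAsync overload that streams VectorSearchResult items, with an illustrative 0.5 cut-off that you should tune against your own relevance data:

```csharp
// 0.5 is an assumed threshold, not a universal constant -- tune it per domain.
const double minScore = 0.5;

await foreach (var result in collection.SearchAsync(queryEmbedding, top: 5))
{
    // Score can be null on some providers; treat missing scores as not relevant.
    if (result.Score is not double score || score < minScore)
        continue;

    Console.WriteLine($"Kept: {result.Record.Title} ({score:F3})");
}
```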

Metadata filtering lets you combine vector similarity with structured constraints, which is incredibly powerful for real-world applications. You might want to find similar blog posts but only from a specific date range, or find relevant products filtered by price range and availability. The supported predicates -- equality comparisons, range queries, and collection membership checks -- depend on your vector store's capabilities.

One important consideration is that you're passing a pre-computed embedding to SearchAsync, not raw text. This design gives you flexibility in how you generate embeddings -- you might want to cache embeddings for common queries, use different embedding models for queries versus documents, or even let users search with images or audio by converting them to embeddings.
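Caching is one payoff of that design. Here's a minimal sketch of a query-embedding cache you could put in front of any generator; the generate delegate below is a stand-in for the real embedding service call, and the counter exists only to show the effect:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical cache in front of an embedding generator: repeated queries for
// the same text reuse the stored vector instead of calling the service again.
var cache = new ConcurrentDictionary<string, ReadOnlyMemory<float>>();
int serviceCalls = 0;

// Stand-in for embeddingService.GenerateEmbeddingAsync -- swap in the real call.
Func<string, Task<ReadOnlyMemory<float>>> generate = text =>
{
    serviceCalls++;
    return Task.FromResult(new ReadOnlyMemory<float>(new float[] { text.Length }));
};

async Task<ReadOnlyMemory<float>> GetCachedEmbeddingAsync(string text)
{
    if (cache.TryGetValue(text, out var hit))
        return hit;

    var embedding = await generate(text);
    cache[text] = embedding; // last writer wins on a race, which is fine here
    return embedding;
}

await GetCachedEmbeddingAsync("async programming in C#");
await GetCachedEmbeddingAsync("async programming in C#"); // served from cache

Console.WriteLine(serviceCalls); // the generator ran only once
```

For common queries this saves both latency and embedding-API cost, since cache hits never touch the service.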

Hybrid Search: Combining Vector and Keyword Search

Hybrid search combines vector similarity search with traditional keyword-based search to provide more relevant results than either approach alone. Vector search excels at finding semantically similar content but can miss exact term matches, while keyword search is great for precise terminology but misses semantic variations. Combining both approaches gives you the best of both worlds.

Azure AI Search has excellent built-in support for hybrid search through its Hybrid Search API. You can specify both a vector query and a text query in the same request, and Azure AI Search will merge the results using reciprocal rank fusion or other ranking algorithms. This is particularly valuable for technical documentation, legal documents, or medical content where exact terminology matters but semantic understanding is also important.

Here's how hybrid search looks; the collection must first be cast to the hybrid search interface, and keywords are passed as a collection of terms:

var queryEmbedding = await embeddingService.GenerateEmbeddingAsync("dependency injection patterns");
var hybridSearchable = (IKeywordHybridSearchable<BlogPost>)collection;

await foreach (var result in hybridSearchable.HybridSearchAsync(
    queryEmbedding,
    keywords: new[] { "dependency", "injection" },
    top: 10))
{
    Console.WriteLine($"Score: {result.Score:F3} | Title: {result.Record.Title}");
}

Note that hybrid search support varies by vector store implementation. Azure AI Search provides the most comprehensive hybrid search capabilities out of the box, while other vector stores might require you to perform vector search and keyword search separately and merge the results yourself.

I use hybrid search for my blog's semantic search feature because readers sometimes search for exact phrases like "async/await" or specific class names, but other times they use natural language queries like "how do I make my code faster". The hybrid approach handles both query styles effectively without requiring me to detect which type of search the user intends.

When implementing hybrid search in your RAG pipeline, you'll typically want to tune the relative weights of vector versus keyword results based on your specific domain and user feedback. Start with equal weights and adjust based on relevance metrics and user satisfaction. For more context on building RAG applications, see my complete guide to RAG with Semantic Kernel C#.

Choosing the Right Vector Store

Selecting the appropriate vector store for your .NET application depends on several factors including dataset size, infrastructure preferences, budget constraints, performance requirements, and operational capabilities. Let me walk through the key considerations and provide a comparison table to help you decide.

For local development and testing, InMemoryVectorStore is the obvious choice because it requires zero setup and works great for datasets under 10,000 items. Once you move to production, the choice becomes more nuanced based on your specific requirements and constraints.

If you're already invested in the Azure ecosystem and want a fully managed service with minimal operational overhead, Azure AI Search is an excellent choice. It provides enterprise-grade reliability, automatic scaling, built-in security, and seamless integration with other Azure services. The pricing is consumption-based, so you pay for what you use, though costs can add up for large-scale deployments.

Qdrant is ideal if you prefer open-source solutions, need maximum performance, or have specific infrastructure requirements that preclude cloud services. The self-hosted option gives you complete control but requires more operational expertise. Qdrant Cloud provides a middle ground with managed hosting but the flexibility of open-source software.

Redis with the RediSearch module is worth considering if you're already using Redis for caching or session management. It provides good vector search performance and lets you consolidate infrastructure, though its vector search capabilities are less mature than purpose-built vector databases.

Pinecone is a vector database specialist with excellent performance and a generous free tier, making it attractive for startups and prototypes. However, it's a proprietary cloud service, so you're locked into their platform and pricing model.

Here's a comparison table to help visualize the trade-offs:

| Vector Store | Best For | Scale | Infrastructure | Cost Model | .NET Support |
| --- | --- | --- | --- | --- | --- |
| InMemoryVectorStore | Development, testing, small static datasets | Up to 10K items | None required | Free | Excellent (built-in) |
| Azure AI Search | Enterprise Azure applications | Millions of items | Fully managed | Consumption-based | Excellent |
| Qdrant | Self-hosted, open-source preference | Billions of items | Self-host or managed | Open-source or subscription | Excellent |
| Redis | Existing Redis users | Millions of items | Self-host or managed | Open-source or subscription | Good |
| Pinecone | Prototypes, startups | Millions of items | Fully managed | Free tier + usage-based | Good |

My general recommendation is to start with InMemoryVectorStore during development, then choose between Azure AI Search and Qdrant based on whether you prefer fully managed services or want more control. Both are excellent production choices with strong .NET support through Semantic Kernel.

FAQ

Can I use multiple vector stores in the same application?

Yes, you can register multiple IVectorStore implementations in your dependency injection container and resolve them by type or name. This is useful if you want to use different stores for different types of data or maintain separate stores for different tenants in a multi-tenant application. The Semantic Kernel vector store in C# abstraction makes this straightforward with its connector model.
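A minimal sketch of that multi-store setup using .NET 8 keyed services; the keys "docs" and "tenants", and the use of InMemoryVectorStore for both registrations, are illustrative:

```csharp
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel.Connectors.InMemory;

var services = new ServiceCollection();

// Two independent stores, distinguished by service key.
services.AddKeyedSingleton<IVectorStore>("docs", (_, _) => new InMemoryVectorStore());
services.AddKeyedSingleton<IVectorStore>("tenants", (_, _) => new InMemoryVectorStore());

using var provider = services.BuildServiceProvider();

// Consumers ask for the store they need by key.
var docsStore = provider.GetRequiredKeyedService<IVectorStore>("docs");
var tenantStore = provider.GetRequiredKeyedService<IVectorStore>("tenants");
```

In production you would register different connector types under each key, for example Azure AI Search for documents and Qdrant for tenant data.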

How do I handle embedding dimension mismatches?

If you change embedding models and the dimensions don't match, you'll need to re-embed all your documents with the new model. Semantic Kernel will throw an exception if you try to insert a vector with dimensions that don't match the collection schema. Plan your embedding model choice carefully because migrations are expensive.

What's the performance difference between InMemoryVectorStore and production stores?

InMemoryVectorStore uses brute-force similarity search with O(n) complexity, comparing your query against every stored vector. Production vector stores use optimized algorithms like HNSW with approximate O(log n) complexity, providing orders of magnitude better performance at scale. For small datasets (under 1,000 items), the difference is negligible, but it becomes critical as your data grows.
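To make the distinction concrete, here's roughly what a brute-force scan does -- every stored vector is compared against the query, so cost grows linearly with collection size, whereas an HNSW index walks a graph and visits only a small fraction of the vectors. The three toy 2-dimensional vectors are illustrative only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine similarity between two equal-length vectors.
static double CosineSimilarity(float[] a, float[] b)
{
    double dot = 0, magA = 0, magB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        magA += a[i] * a[i];
        magB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(magA) * Math.Sqrt(magB));
}

var stored = new Dictionary<string, float[]>
{
    ["doc-a"] = new float[] { 1f, 0f },
    ["doc-b"] = new float[] { 0f, 1f },
    ["doc-c"] = new float[] { 0.9f, 0.1f },
};
var query = new float[] { 1f, 0f };

// Brute force: every record is visited once -- O(n) in the collection size.
var ranked = stored
    .Select(kv => (kv.Key, Score: CosineSimilarity(query, kv.Value)))
    .OrderByDescending(r => r.Score)
    .ToList();

Console.WriteLine(ranked[0].Key); // doc-a is the closest match
```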

How do I handle schema changes in production vector stores?

Schema evolution depends on your vector store implementation. Azure AI Search supports adding new fields to existing indexes, but changing vector dimensions requires creating a new index. Most vector stores don't support in-place schema migrations, so plan for blue-green deployments or re-indexing strategies when you need to change your data model.

Conclusion

The Semantic Kernel vector store in C# provides a powerful, unified API for working with various vector databases in your .NET applications. By using the VectorStore abstraction and consistent patterns across implementations, you can develop locally with InMemoryVectorStore and deploy to production with confidence using Azure AI Search, Qdrant, or other connectors.

The key to success with the Semantic Kernel vector store in C# is understanding the trade-offs between different options and choosing the right tool for your specific requirements. Start simple with InMemoryVectorStore during development, define your data models carefully with proper attributes, and graduate to a production vector store when you're ready to scale. The abstraction layer ensures your core application logic remains unchanged regardless of which backend you choose.

For more context on how vector stores fit into the broader picture of building AI applications with .NET, check out my complete guide to Semantic Kernel C#. Understanding async/await patterns is also essential since all vector store operations in Semantic Kernel are asynchronous.

The vector store landscape continues to evolve rapidly, with new connectors and features being added regularly to Semantic Kernel. I recommend staying current with the latest releases and experimenting with different options to find what works best for your specific use cases. Building robust RAG applications requires choosing the right foundation, and Semantic Kernel's vector store abstraction gives you the flexibility to make that choice without locking yourself into a single vendor or technology.

