Build a Semantic Search Engine with Semantic Kernel in C#
Keyword search can't find "show me how to configure authentication" when the answer is titled "Setting Up JWT Token Validation." The words don't match, but the meaning does. Building a semantic search engine with Semantic Kernel in C# solves this by encoding both queries and corpus items as vectors and ranking results by meaning-distance -- no keyword overlap required.
This article walks through a working .NET 9 console app that loads a structured JSON corpus, batch-embeds each item using ITextEmbeddingGenerationService, stores the embeddings in an InMemoryVectorStore, and performs similarity search with optional category filtering via VectorSearchOptions<T>. Every class is shown with real code from the working project.
The complete source is in semantic-kernel-examples/ai-semantic-search. The app runs against the included sample corpus of 20 FAQ items or any JSON corpus you point it at.
Semantic Search vs. RAG -- Knowing Which One You Need
These two patterns are often conflated but solve different problems. RAG with Semantic Kernel in C# retrieves context and then calls an LLM to generate a synthesized answer from that context. A semantic search engine retrieves and ranks -- it returns the best-matching items from your corpus, scored, without synthesizing anything new.
Semantic search is the right choice when:
- You want to return existing records (FAQ items, documentation pages, products, tickets)
- Scores and ranking matter to your UI
- You need category or metadata filtering alongside the vector search
- You do not need the model to generate prose -- the corpus items are the answer
For this app, no chat completion model is configured at all. The only AI service is ITextEmbeddingGenerationService. This is intentional -- it shows the vector store layer independently of the full RAG pipeline.
Project Setup
The app uses the same package set as the Document Q&A app:
<PackageReference Include="Microsoft.SemanticKernel" Version="1.72.0" />
<PackageReference Include="Microsoft.SemanticKernel.Connectors.InMemory" Version="1.72.0-preview" />
<PackageReference Include="Microsoft.Extensions.Configuration.Json" Version="9.0.2" />
<PackageReference Include="Microsoft.Extensions.Configuration.Binder" Version="9.0.2" />
<PackageReference Include="Microsoft.Extensions.VectorData.Abstractions" Version="10.0.0" />
Configuration requires only the EmbeddingAI section -- no ChatAI:
{
"EmbeddingAI": {
"Type": "azureopenai",
"ModelId": "text-embedding-ada-002",
"Endpoint": "https://your-resource.openai.azure.com/",
"ApiKey": ""
}
}
The text embeddings with Semantic Kernel in C# article covers the difference between text-embedding-ada-002, text-embedding-3-small, and text-embedding-3-large and when to choose each.
Defining the Searchable Item
SearchItem is the vector store record. Each item has an ID, title, body text, an optional category string, and the embedding vector:
using Microsoft.Extensions.VectorData;
public sealed class SearchItem
{
[VectorStoreKey]
public string Id { get; set; } = "";
[VectorStoreData]
public string Title { get; set; } = "";
[VectorStoreData]
public string Body { get; set; } = "";
// Optional category for filtered search (e.g., "concept", "howto", "troubleshooting")
[VectorStoreData]
public string Category { get; set; } = "";
// 1536 dimensions: compatible with text-embedding-ada-002 and text-embedding-3-small
[VectorStoreVector(1536, DistanceFunction = DistanceFunction.CosineSimilarity)]
public ReadOnlyMemory<float> Embedding { get; set; }
}
One change from earlier SK versions: IsFilterable was removed as a property of [VectorStoreData] in Microsoft.Extensions.VectorData v10.0.0. In v10, filtering is expressed via Expression<Func<T, bool>> directly in VectorSearchOptions<T> -- the attribute does not need to opt in. The Semantic Kernel vector store article covers the full v10 attribute changes in context.
The Corpus Format
The sample corpus is a JSON array. Any structured knowledge base can be converted to this format:
[
{
"Id": "sk-001",
"Title": "What is Semantic Kernel?",
"Body": "Semantic Kernel is an open-source SDK from Microsoft...",
"Category": "concept"
},
{
"Id": "sk-006",
"Title": "Why is my Semantic Kernel plugin not being called?",
"Body": "Common causes: FunctionChoiceBehavior is set to None...",
"Category": "troubleshooting"
}
]
The sample corpus has 20 items across two topic domains (Semantic Kernel and dependency injection) and three categories (concept, howto, troubleshooting). The categories enable filtered search -- you can restrict results to troubleshooting articles only, for example.
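Loading that JSON file into SearchItem records takes one System.Text.Json call. This is a sketch of the loading step, not the sample app's exact loader -- the file name and options here are assumptions:

```csharp
using System.Text.Json;

// Deserialize the corpus JSON into the SearchItem records defined above.
// The Embedding property is absent from the file and stays empty until
// CorpusIndexer fills it in during indexing.
var json = await File.ReadAllTextAsync("sample-corpus.json");
var items = JsonSerializer.Deserialize<List<SearchItem>>(
    json,
    new JsonSerializerOptions { PropertyNameCaseInsensitive = true }) ?? [];
```

Case-insensitive matching keeps the loader tolerant of corpora that use camelCase property names instead of the PascalCase shown above.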
Batch Embedding the Corpus
CorpusIndexer embeds all items in a single batch API call and stores them in the vector collection:
public sealed class CorpusIndexer
{
private const string CollectionName = "corpus";
private readonly ITextEmbeddingGenerationService _embeddingService;
private readonly VectorStoreCollection<string, SearchItem> _collection;
public CorpusIndexer(
ITextEmbeddingGenerationService embeddingService,
VectorStore vectorStore)
{
_embeddingService = embeddingService;
_collection = vectorStore.GetCollection<string, SearchItem>(CollectionName);
}
// Exposed so Program.cs can hand the same collection to the search engine
public VectorStoreCollection<string, SearchItem> Collection => _collection;
public async Task IndexAsync(
IEnumerable<SearchItem> items,
CancellationToken cancellationToken = default)
{
await _collection.EnsureCollectionExistsAsync(cancellationToken);
var itemList = items.ToList();
// Embed title + body together for richer semantic representation
var texts = itemList.Select(item => $"{item.Title}\n{item.Body}").ToList();
var embeddings = await _embeddingService
.GenerateEmbeddingsAsync(texts, cancellationToken: cancellationToken);
for (int i = 0; i < itemList.Count; i++)
{
itemList[i].Embedding = embeddings[i];
await _collection.UpsertAsync(itemList[i], cancellationToken: cancellationToken);
}
}
}
Embedding the title and body together instead of the body alone gives the embedding model the full context of each item. A title like "Why is my plugin not being called?" carries more diagnostic signal than the body alone. The batch call to GenerateEmbeddingsAsync sends all texts in one request -- efficient for startup indexing whether the corpus has 20 items or 2000.
VectorStore.GetCollection<string, SearchItem>(CollectionName) resolves to the corpus collection. When you swap InMemoryVectorStore for Azure AI Search in production, this line doesn't change -- only the DI registration changes.
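As a sketch of that production swap, the DI registration might look like the following -- this assumes the Microsoft.SemanticKernel.Connectors.AzureAISearch package, and the endpoint and key are placeholders, not values from the sample project:

```csharp
using Azure;
using Azure.Search.Documents.Indexes;
using Microsoft.SemanticKernel.Connectors.AzureAISearch;

// Replaces the single AddSingleton<VectorStore, InMemoryVectorStore>() line.
// CorpusIndexer and SemanticSearchEngine are unchanged.
builder.Services.AddSingleton<VectorStore>(_ =>
    new AzureAISearchVectorStore(
        new SearchIndexClient(
            new Uri("https://your-search.search.windows.net"),
            new AzureKeyCredential("your-admin-key"))));
```

Because both classes depend on the abstract VectorStore and VectorStoreCollection types, the swap is invisible to everything above the DI container.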
Semantic Search with Scoring and Filtering
SemanticSearchEngine embeds the query and returns top-k results, optionally filtered by category:
public sealed class SemanticSearchEngine
{
private readonly ITextEmbeddingGenerationService _embeddingService;
private readonly VectorStoreCollection<string, SearchItem> _collection;
public SemanticSearchEngine(
ITextEmbeddingGenerationService embeddingService,
VectorStoreCollection<string, SearchItem> collection)
{
_embeddingService = embeddingService;
_collection = collection;
}
public async Task<IReadOnlyList<SearchResult>> SearchAsync(
string query,
int topK = 5,
string? categoryFilter = null,
CancellationToken cancellationToken = default)
{
var queryEmbeddings = await _embeddingService
.GenerateEmbeddingsAsync([query], cancellationToken: cancellationToken);
var queryVector = queryEmbeddings[0];
// v10 API: Filter is Expression<Func<T, bool>> -- a lambda expression
VectorSearchOptions<SearchItem>? options = null;
if (!string.IsNullOrWhiteSpace(categoryFilter))
{
var cat = categoryFilter;
options = new VectorSearchOptions<SearchItem>
{
Filter = item => item.Category == cat
};
}
var searchable = (IVectorSearchable<SearchItem>)_collection;
var results = searchable.SearchAsync(queryVector, topK, options, cancellationToken);
var output = new List<SearchResult>();
await foreach (var result in results.WithCancellation(cancellationToken))
output.Add(new SearchResult(result.Record, result.Score ?? 0.0));
return output;
}
}
// Result type returned by SearchAsync: the matched record plus its score
public sealed record SearchResult(SearchItem Item, double Score);
The VectorSearchOptions<T>.Filter property in Microsoft.Extensions.VectorData v10.0.0 is Expression<Func<T, bool>> -- a standard LINQ lambda expression. This is cleaner than the older VectorSearchFilter().EqualTo(...) fluent syntax: it's type-safe, checked by the compiler, and works with IntelliSense. Passing null for options skips filtering entirely.
result.Score is double? (nullable) in v10 -- use ?? 0.0 as a safe default. Scores are cosine similarity values; for these embedding models they fall between roughly 0 (unrelated) and 1 (identical meaning). In practice, scores above 0.85 indicate strong semantic overlap; 0.6–0.85 is related; below 0.6 is weak relevance.
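If you would rather suppress weak matches than display them, a post-filter over the returned list is enough. MinScore here is an assumed app-level constant, not something the sample app defines:

```csharp
// Drop results below an assumed relevance floor before rendering.
// 0.6 matches the "weak relevance" boundary discussed above; tune it
// against your own corpus.
const double MinScore = 0.6;
var strongMatches = results
    .Where(r => r.Score >= MinScore)
    .ToList();
```

SearchAsync already returns results ordered by score, so the filter preserves ranking.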
Wiring the App in Program.cs
Kernel setup for an embedding-only app is simpler than the full RAG case -- no chat service is registered. The key difference is that GetRequiredService<ITextEmbeddingGenerationService>() is the only AI service call at runtime. Everything else is vector store plumbing.
var builder = Kernel.CreateBuilder();
// Only embedding service needed -- no chat completion
builder.AddAzureOpenAITextEmbeddingGeneration(
deploymentName: embeddingConfig.ModelId,
endpoint: embeddingConfig.Endpoint,
apiKey: embeddingConfig.ApiKey);
builder.Services.AddSingleton<VectorStore, InMemoryVectorStore>();
var kernel = builder.Build();
var embeddingService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
var vectorStore = kernel.Services.GetRequiredService<VectorStore>();
var indexer = new CorpusIndexer(embeddingService, vectorStore);
await indexer.IndexAsync(items);
var engine = new SemanticSearchEngine(embeddingService, indexer.Collection);
Running the App
The CLI supports four flags: --query for a single search, --category to filter by category, --corpus to point at a custom JSON file, and --top to control result count. Without --query, the app enters interactive mode for multi-turn exploration.
dotnet run -- --query "how do I configure plugins"
# Top 5 results for: "how do I configure plugins"
# [1] (0.912) [howto] How do I add a plugin to Semantic Kernel?
# Create a class with methods decorated with [KernelFunction]...
# [2] (0.874) [concept] What is FunctionChoiceBehavior in Semantic Kernel?
# FunctionChoiceBehavior controls whether the LLM can automatically invoke...
Filtered search limits results to a specific category:
dotnet run -- --query "embeddings not working" --category troubleshooting
# Top 5 results for: "embeddings not working" (category: troubleshooting)
# [1] (0.891) [troubleshooting] My embedding dimensions don't match...
# [2] (0.823) [troubleshooting] Why is my Semantic Kernel plugin not being called?
Interactive mode is available for iterative exploration:
dotnet run
# Interactive mode -- type a search query and press Enter.
Search Quality Patterns
The quality of semantic search depends on embedding granularity and query framing. A few patterns that work well:
Embed at the right granularity. The corpus items in this app are paragraph-sized FAQ answers -- specific enough that each embedding represents a single concept. Large multi-topic items produce diffuse embeddings that rank weakly for specific queries. If your corpus has long items, apply the chunking strategies from the chunking strategies for RAG article before indexing.
Embed title and body together. Title-only embeddings miss the semantic content of the body. Body-only embeddings lose the navigational intent that titles carry. Concatenating both produces embeddings that match both high-level queries ("SK plugins") and specific technical queries ("FunctionChoiceBehavior auto invoke").
Use category filtering for precision. When you know the user's intent category (they clicked a "troubleshooting" tab, for example), filtering by category before scoring eliminates off-topic high-scoring results. A troubleshooting query for "embedding errors" should not return "howto" articles even if they score well.
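Putting that last pattern into code, a "troubleshooting" tab click maps directly onto the categoryFilter parameter of SearchAsync shown earlier (this usage sketch assumes SearchResult exposes the matched item and its score):

```csharp
// User is on the troubleshooting tab, so restrict scoring to that category.
var results = await engine.SearchAsync(
    "embedding dimensions don't match",
    topK: 3,
    categoryFilter: "troubleshooting");

foreach (var r in results)
    Console.WriteLine($"({r.Score:F3}) {r.Item.Title}");
```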
Frequently Asked Questions
What is the difference between semantic search and keyword search?
Keyword search matches exact or stemmed words. Semantic search matches meaning. "How do I handle errors?" and "exception management strategy" have no word overlap but are semantically close. Semantic search finds the latter from the former because both sentences encode to similar vector positions in the embedding space.
Can I use this without Azure OpenAI?
Yes. Set "Type": "openai" in EmbeddingAI, provide an ApiKey, and use a model like text-embedding-3-small. Remove the Endpoint field. AddOpenAITextEmbeddingGeneration works identically. The rest of the app is provider-agnostic.
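In code, the only Program.cs line that changes is the builder call. A sketch, reusing the embeddingConfig binding from the Azure example:

```csharp
// OpenAI (non-Azure) embedding registration -- note: no endpoint parameter,
// and modelId replaces deploymentName.
var builder = Kernel.CreateBuilder();
builder.AddOpenAITextEmbeddingGeneration(
    modelId: "text-embedding-3-small",
    apiKey: embeddingConfig.ApiKey);
```

text-embedding-3-small produces 1536-dimension vectors by default, so the SearchItem record needs no changes.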
When should I use InMemoryVectorStore vs. a persistent backend?
InMemoryVectorStore is for development and demos -- all data is lost on process exit. For production, swap it for Azure AI Search, Qdrant, or another SK connector. You re-index at startup if the corpus is small enough. For large corpora (thousands of items), use a persistent backend and index once. The Semantic Kernel vector store article covers the production options.
How does the lambda filter work in VectorSearchOptions in v10?
VectorSearchOptions<T>.Filter in Microsoft.Extensions.VectorData v10.0.0 is Expression<Func<T, bool>>. You assign a standard LINQ lambda: Filter = item => item.Category == "howto". This replaces the older VectorSearchFilter().EqualTo(field, value) API from earlier versions. It compiles at build time, and the vector store translates it to its native filter syntax (SQL predicate for relational backends, filter expression for Azure AI Search, etc.).
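Because the filter is an ordinary boolean expression, it composes. The OR example below is an illustration, not something the sample app uses -- and which operators a given connector can translate to its native filter syntax varies by backend:

```csharp
// Match items in either of two categories. Check your connector's
// documentation for which expression shapes it supports.
var options = new VectorSearchOptions<SearchItem>
{
    Filter = item => item.Category == "howto" || item.Category == "concept"
};
```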
What does the similarity score mean and how should I interpret it?
The score is cosine similarity between the query embedding and the item embedding, in practice ranging from 0 to 1 for these models. Values above 0.85 typically indicate strong topical overlap. Values between 0.6 and 0.85 indicate relevance on the same topic. Values below 0.6 indicate weak or incidental overlap. What counts as a "good" threshold depends on your corpus and use case -- display scores to users and let them adjust expectations, or add a minimum score threshold to filter weak results.
How is this different from building a document Q&A app with RAG?
This app returns ranked items from a pre-built corpus -- no generation. The Document Q&A app adds a second step: it injects retrieved chunks into an LLM prompt and generates a synthesized prose answer. Use semantic search when you want to surface existing records. Use RAG when you want the model to synthesize an answer from retrieved context. Both use the same underlying ITextEmbeddingGenerationService and VectorStoreCollection<K,V> -- the difference is what you do after retrieval.
Can I run the indexing once and reuse it across queries?
Yes -- and that's the standard production pattern. Index the corpus at startup (or once during a seeding step), keep the VectorStore alive for the process lifetime, and call SearchAsync on every query without re-indexing. With InMemoryVectorStore, the collection lives in memory between calls. With a persistent backend, skip EnsureCollectionExistsAsync and UpsertAsync if the collection already exists.
Clone It and Point It at Your Corpus
Get the full source from semantic-kernel-examples/ai-semantic-search. Copy appsettings.Development.json.example to appsettings.Development.json, fill in your embedding deployment, and run:
dotnet run -- --query "your search query"
To search your own knowledge base, create a JSON file matching the corpus format (Id, Title, Body, optional Category) and pass it via --corpus:
dotnet run -- --corpus ./my-knowledge-base.json --query "how to handle rate limits" --category troubleshooting
Building a semantic search engine with Semantic Kernel in C# requires fewer components than RAG -- just an embedding model, a vector store, and a search layer. That simplicity makes it a practical starting point for any internal search problem before you decide whether the project needs a full generation step on top.