Build a Semantic Search Engine with Semantic Kernel in C#
Keyword search can't find "show me how to configure authentication" when the answer is titled "Setting Up JWT Token Validation." The words don't match, but the meaning does. Building a semantic search engine with Semantic Kernel in C# solves this by encoding both queries and corpus items as vectors and ranking results by meaning-distance -- no keyword overlap required.
This article walks through a working .NET 9 console app that loads a structured JSON corpus, batch-embeds each item using ITextEmbeddingGenerationService, stores the embeddings in an InMemoryVectorStore, and performs similarity search with optional category filtering via VectorSearchOptions<T>. Every class is shown with real code from the working project.
The complete source is in semantic-kernel-examples/ai-semantic-search. The app runs against the included sample corpus of 20 FAQ items or any JSON corpus you point it at.
Semantic Search vs. RAG -- Knowing Which One You Need
These two patterns are often conflated but solve different problems. RAG with Semantic Kernel in C# retrieves context and then calls an LLM to generate a synthesized answer from that context. A semantic search engine retrieves and ranks -- it returns the best-matching items from your corpus, scored, without synthesizing anything new.
Semantic search is the right choice when:
- You want to return existing records (FAQ items, documentation pages, products, tickets)
- Scores and ranking matter to your UI
- You need category or metadata filtering alongside the vector search
- You do not need the model to generate prose -- the corpus items are the answer
For this app, no chat completion model is configured at all. The only AI service is ITextEmbeddingGenerationService. This is intentional -- it shows the vector store layer independently of the full RAG pipeline.
Project Setup
The app uses the same package set as the Document Q&A app:
<PackageReference Include="Microsoft.SemanticKernel" Version="1.72.0" />
<PackageReference Include="Microsoft.SemanticKernel.Connectors.InMemory" Version="1.72.0-preview" />
<PackageReference Include="Microsoft.Extensions.Configuration.Json" Version="9.0.2" />
<PackageReference Include="Microsoft.Extensions.Configuration.Binder" Version="9.0.2" />
<PackageReference Include="Microsoft.Extensions.VectorData.Abstractions" Version="10.0.0" />
Configuration requires only the EmbeddingAI section -- no ChatAI:
{
"EmbeddingAI": {
"Type": "azureopenai",
"ModelId": "text-embedding-ada-002",
"Endpoint": "https://your-resource.openai.azure.com/",
"ApiKey": ""
}
}
The text embeddings with Semantic Kernel in C# article covers the difference between text-embedding-ada-002, text-embedding-3-small, and text-embedding-3-large and when to choose each.
Defining the Searchable Item
SearchItem is the vector store record. Each item has an ID, title, body text, an optional category string, and the embedding vector:
using Microsoft.Extensions.VectorData;
public sealed class SearchItem
{
[VectorStoreKey]
public string Id { get; set; } = "";
[VectorStoreData]
public string Title { get; set; } = "";
[VectorStoreData]
public string Body { get; set; } = "";
// Optional category for filtered search (e.g., "concept", "howto", "troubleshooting")
[VectorStoreData]
public string Category { get; set; } = "";
// 1536 dimensions: compatible with text-embedding-ada-002 and text-embedding-3-small
[VectorStoreVector(1536, DistanceFunction = DistanceFunction.CosineSimilarity)]
public ReadOnlyMemory<float> Embedding { get; set; }
}
One change from earlier SK versions: IsFilterable was removed as a property of [VectorStoreData] in Microsoft.Extensions.VectorData v10.0.0. In v10, filtering is expressed via Expression<Func<T, bool>> directly in VectorSearchOptions<T> -- the attribute does not need to opt in. The Semantic Kernel vector store article covers the full v10 attribute changes in context.
The Corpus Format
The sample corpus is a JSON array. Any structured knowledge base can be converted to this format:
[
{
"Id": "sk-001",
"Title": "What is Semantic Kernel?",
"Body": "Semantic Kernel is an open-source SDK from Microsoft...",
"Category": "concept"
},
{
"Id": "sk-006",
"Title": "Why is my Semantic Kernel plugin not being called?",
"Body": "Common causes: FunctionChoiceBehavior is set to None...",
"Category": "troubleshooting"
}
]
The sample corpus has 20 items across two topic domains (Semantic Kernel and dependency injection) and three categories (concept, howto, troubleshooting). The categories enable filtered search -- you can restrict results to troubleshooting articles only, for example.
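Loading that JSON file into SearchItem records takes one System.Text.Json call. This is a sketch of the loading step, not the sample app's exact loader -- the file name and options here are assumptions:

```csharp
using System.Text.Json;

// Deserialize the corpus JSON into the SearchItem records defined above.
// The Embedding property is absent from the file and stays empty until
// CorpusIndexer fills it in during indexing.
var json = await File.ReadAllTextAsync("sample-corpus.json");
var items = JsonSerializer.Deserialize<List<SearchItem>>(
    json,
    new JsonSerializerOptions { PropertyNameCaseInsensitive = true }) ?? [];
```

Case-insensitive matching keeps the loader tolerant of corpora that use camelCase property names instead of the PascalCase shown above.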
Batch Embedding the Corpus
CorpusIndexer embeds all items in a single batch API call and stores them in the vector collection:
public sealed class CorpusIndexer
{
private const string CollectionName = "corpus";
private readonly ITextEmbeddingGenerationService _embeddingService;
private readonly VectorStoreCollection<string, SearchItem> _collection;
public CorpusIndexer(
ITextEmbeddingGenerationService embeddingService,
VectorStore vectorStore)
{
_embeddingService = embeddingService;
_collection = vectorStore.GetCollection<string, SearchItem>(CollectionName);
}
// Exposed so Program.cs can hand the same collection to the search engine
public VectorStoreCollection<string, SearchItem> Collection => _collection;
public async Task IndexAsync(
IEnumerable<SearchItem> items,
CancellationToken cancellationToken = default)
{
await _collection.EnsureCollectionExistsAsync(cancellationToken);
var itemList = items.ToList();
// Embed title + body together for richer semantic representation
var texts = itemList.Select(item => $"{item.Title}\n{item.Body}").ToList();
var embeddings = await _embeddingService
.GenerateEmbeddingsAsync(texts, cancellationToken: cancellationToken);
for (int i = 0; i < itemList.Count; i++)
{
itemList[i].Embedding = embeddings[i];
await _collection.UpsertAsync(itemList[i], cancellationToken: cancellationToken);
}
}
}
Embedding the title and body together instead of the body alone gives the embedding model the full context of each item. A title like "Why is my plugin not being called?" carries more diagnostic signal than the body alone. The batch call to GenerateEmbeddingsAsync sends all texts in one request -- efficient for startup indexing whether the corpus has 20 items or 2000.
VectorStore.GetCollection<string, SearchItem>(CollectionName) resolves to the corpus collection. When you swap InMemoryVectorStore for Azure AI Search in production, this line doesn't change -- only the DI registration changes.
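As a sketch of that production swap, the DI registration might look like the following -- this assumes the Microsoft.SemanticKernel.Connectors.AzureAISearch package, and the endpoint and key are placeholders, not values from the sample project:

```csharp
using Azure;
using Azure.Search.Documents.Indexes;
using Microsoft.SemanticKernel.Connectors.AzureAISearch;

// Replaces the single AddSingleton<VectorStore, InMemoryVectorStore>() line.
// CorpusIndexer and SemanticSearchEngine are unchanged.
builder.Services.AddSingleton<VectorStore>(_ =>
    new AzureAISearchVectorStore(
        new SearchIndexClient(
            new Uri("https://your-search.search.windows.net"),
            new AzureKeyCredential("your-admin-key"))));
```

Because both classes depend on the abstract VectorStore and VectorStoreCollection types, the swap is invisible to everything above the DI container.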
Semantic Search with Scoring and Filtering
SemanticSearchEngine embeds the query and returns top-k results, optionally filtered by category:
public sealed class SemanticSearchEngine
{
private readonly ITextEmbeddingGenerationService _embeddingService;
private readonly VectorStoreCollection<string, SearchItem> _collection;
public SemanticSearchEngine(
ITextEmbeddingGenerationService embeddingService,
VectorStoreCollection<string, SearchItem> collection)
{
_embeddingService = embeddingService;
_collection = collection;
}
public async Task<IReadOnlyList<SearchResult>> SearchAsync(
string query,
int topK = 5,
string? categoryFilter = null,
CancellationToken cancellationToken = default)
{
var queryEmbeddings = await _embeddingService
.GenerateEmbeddingsAsync([query], cancellationToken: cancellationToken);
var queryVector = queryEmbeddings[0];
// v10 API: Filter is Expression<Func<T, bool>> -- a lambda expression
VectorSearchOptions<SearchItem>? options = null;
if (!string.IsNullOrWhiteSpace(categoryFilter))
{
var cat = categoryFilter;
options = new VectorSearchOptions<SearchItem>
{
Filter = item => item.Category == cat
};
}
var searchable = (IVectorSearchable<SearchItem>)_collection;
var results = searchable.SearchAsync(queryVector, topK, options, cancellationToken);
var output = new List<SearchResult>();
await foreach (var result in results.WithCancellation(cancellationToken))
output.Add(new SearchResult(result.Record, result.Score ?? 0.0));
return output;
}
}
// Result type returned by SearchAsync: the matched record plus its score
public sealed record SearchResult(SearchItem Item, double Score);
The VectorSearchOptions<T>.Filter property in Microsoft.Extensions.VectorData v10.0.0 is Expression<Func<T, bool>> -- a standard LINQ lambda expression. This is cleaner than the older VectorSearchFilter().EqualTo(...) fluent syntax: it's type-safe, checked by the compiler, and works with IntelliSense. Passing null for options skips filtering entirely.
result.Score is double? (nullable) in v10 -- use ?? 0.0 as a safe default. Scores are cosine similarity values; for these embedding models they fall between roughly 0 (unrelated) and 1 (identical meaning). In practice, scores above 0.85 indicate strong semantic overlap; 0.6–0.85 is related; below 0.6 is weak relevance.
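If you would rather suppress weak matches than display them, a post-filter over the returned list is enough. MinScore here is an assumed app-level constant, not something the sample app defines:

```csharp
// Drop results below an assumed relevance floor before rendering.
// 0.6 matches the "weak relevance" boundary discussed above; tune it
// against your own corpus.
const double MinScore = 0.6;
var strongMatches = results
    .Where(r => r.Score >= MinScore)
    .ToList();
```

SearchAsync already returns results ordered by score, so the filter preserves ranking.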
Wiring the App in Program.cs
Kernel setup for an embedding-only app is simpler than the full RAG case -- no chat service is registered. The key difference is that GetRequiredService<ITextEmbeddingGenerationService>() is the only AI service call at runtime. Everything else is vector store plumbing.
var builder = Kernel.CreateBuilder();
// Only embedding service needed -- no chat completion
builder.AddAzureOpenAITextEmbeddingGeneration(
deploymentName: embeddingConfig.ModelId,
endpoint: embeddingConfig.Endpoint,
apiKey: embeddingConfig.ApiKey);
builder.Services.AddSingleton<VectorStore, InMemoryVectorStore>();
var kernel = builder.Build();
var embeddingService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
var vectorStore = kernel.Services.GetRequiredService<VectorStore>();
var indexer = new CorpusIndexer(embeddingService, vectorStore);
await indexer.IndexAsync(items);
var engine = new SemanticSearchEngine(embeddingService, indexer.Collection);
Running the App
The CLI supports four flags: --query for a single search, --category to filter by category, --corpus to point at a custom JSON file, and --top to control result count. Without --query, the app enters interactive mode for multi-turn exploration.
dotnet run -- --query "how do I configure plugins"
# Top 5 results for: "how do I configure plugins"
# [1] (0.912) [howto] How do I add a plugin to Semantic Kernel?
# Create a class with methods decorated with [KernelFunction]...
# [2] (0.874) [concept] What is FunctionChoiceBehavior in Semantic Kernel?
# FunctionChoiceBehavior controls whether the LLM can automatically invoke...
Filtered search limits results to a specific category:
dotnet run -- --query "embeddings not working" --category troubleshooting
# Top 5 results for: "embeddings not working" (category: troubleshooting)
# [1] (0.891) [troubleshooting] My embedding dimensions don't match...
# [2] (0.823) [troubleshooting] Why is my Semantic Kernel plugin not being called?
Interactive mode is available for iterative exploration:
dotnet run
# Interactive mode -- type a search query and press Enter.
Search Quality Patterns
The quality of semantic search depends on embedding granularity and query framing. A few patterns that work well:
Embed at the right granularity. The corpus items in this app are paragraph-sized FAQ answers -- specific enough that each embedding represents a single concept. Large multi-topic items produce diffuse embeddings that rank weakly for specific queries. If your corpus has long items, apply the chunking strategies from the chunking strategies for RAG article before indexing.
Embed title and body together. Title-only embeddings miss the semantic content of the body. Body-only embeddings lose the navigational intent that titles carry. Concatenating both produces embeddings that match both high-level queries ("SK plugins") and specific technical queries ("FunctionChoiceBehavior auto invoke").
Use category filtering for precision. When you know the user's intent category (they clicked a "troubleshooting" tab, for example), filtering by category before scoring eliminates off-topic high-scoring results. A troubleshooting query for "embedding errors" should not return "howto" articles even if they score well.
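Putting that last pattern into code, a "troubleshooting" tab click maps directly onto the categoryFilter parameter of SearchAsync shown earlier (this usage sketch assumes SearchResult exposes the matched item and its score):

```csharp
// User is on the troubleshooting tab, so restrict scoring to that category.
var results = await engine.SearchAsync(
    "embedding dimensions don't match",
    topK: 3,
    categoryFilter: "troubleshooting");

foreach (var r in results)
    Console.WriteLine($"({r.Score:F3}) {r.Item.Title}");
```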
Frequently Asked Questions
What is the difference between semantic search and keyword search?
Keyword search matches exact or stemmed words. Semantic search matches meaning. "How do I handle errors?" and "exception management strategy" have no word overlap but are semantically close. Semantic search finds the latter from the former because both sentences encode to similar vector positions in the embedding space.
Can I use this without Azure OpenAI?
Yes. Set "Type": "openai" in EmbeddingAI, provide an ApiKey, and use a model like text-embedding-3-small. Remove the Endpoint field. AddOpenAITextEmbeddingGeneration works identically. The rest of the app is provider-agnostic.
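In code, the only Program.cs line that changes is the builder call. A sketch, reusing the embeddingConfig binding from the Azure example:

```csharp
// OpenAI (non-Azure) embedding registration -- note: no endpoint parameter,
// and modelId replaces deploymentName.
var builder = Kernel.CreateBuilder();
builder.AddOpenAITextEmbeddingGeneration(
    modelId: "text-embedding-3-small",
    apiKey: embeddingConfig.ApiKey);
```

text-embedding-3-small produces 1536-dimension vectors by default, so the SearchItem record needs no changes.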
When should I use InMemoryVectorStore vs. a persistent backend?
InMemoryVectorStore is for development and demos -- all data is lost on process exit. For production, swap it for Azure AI Search, Qdrant, or another SK connector. You re-index at startup if the corpus is small enough. For large corpora (thousands of items), use a persistent backend and index once. The Semantic Kernel vector store article covers the production options.
How does the lambda filter work in VectorSearchOptions in v10?
VectorSearchOptions<T>.Filter in Microsoft.Extensions.VectorData v10.0.0 is Expression<Func<T, bool>>. You assign a standard LINQ lambda: Filter = item => item.Category == "howto". This replaces the older VectorSearchFilter().EqualTo(field, value) API from earlier versions. It compiles at build time, and the vector store translates it to its native filter syntax (SQL predicate for relational backends, filter expression for Azure AI Search, etc.).
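Because the filter is an ordinary boolean expression, it composes. The OR example below is an illustration, not something the sample app uses -- and which operators a given connector can translate to its native filter syntax varies by backend:

```csharp
// Match items in either of two categories. Check your connector's
// documentation for which expression shapes it supports.
var options = new VectorSearchOptions<SearchItem>
{
    Filter = item => item.Category == "howto" || item.Category == "concept"
};
```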
What does the similarity score mean and how should I interpret it?
The score is cosine similarity between the query embedding and the item embedding, in practice ranging from 0 to 1 for these models. Values above 0.85 typically indicate strong topical overlap. Values between 0.6 and 0.85 indicate relevance on the same topic. Values below 0.6 indicate weak or incidental overlap. What counts as a "good" threshold depends on your corpus and use case -- display scores to users and let them adjust expectations, or add a minimum score threshold to filter weak results.
How is this different from building a document Q&A app with RAG?
This app returns ranked items from a pre-built corpus -- no generation. The Document Q&A app adds a second step: it injects retrieved chunks into an LLM prompt and generates a synthesized prose answer. Use semantic search when you want to surface existing records. Use RAG when you want the model to synthesize an answer from retrieved context. Both use the same underlying ITextEmbeddingGenerationService and VectorStoreCollection<K,V> -- the difference is what you do after retrieval.
Can I run the indexing once and reuse it across queries?
Yes -- and that's the standard production pattern. Index the corpus at startup (or once during a seeding step), keep the VectorStore alive for the process lifetime, and call SearchAsync on every query without re-indexing. With InMemoryVectorStore, the collection lives in memory between calls. With a persistent backend, skip EnsureCollectionExistsAsync and UpsertAsync if the collection already exists.
Clone It and Point It at Your Corpus
Get the full source from semantic-kernel-examples/ai-semantic-search. Copy appsettings.Development.json.example to appsettings.Development.json, fill in your embedding deployment, and run:
dotnet run -- --query "your search query"
To search your own knowledge base, create a JSON file matching the corpus format (Id, Title, Body, optional Category) and pass it via --corpus:
dotnet run -- --corpus ./my-knowledge-base.json --query "how to handle rate limits" --category troubleshooting
Building a semantic search engine with Semantic Kernel in C# requires fewer components than RAG -- just an embedding model, a vector store, and a search layer. That simplicity makes it a practical starting point for any internal search problem before you decide whether the project needs a full generation step on top.