How C# Source Generators Work: The Roslyn Compilation Pipeline Explained
Understanding how C# source generators work is one of those things that fundamentally changes the way you think about compile-time code generation. You probably already know what source generators do -- they produce C# source files at compile time, before your application ever runs. But knowing how they plug into the Roslyn compilation pipeline -- and specifically how the IIncrementalGenerator incremental execution model operates -- separates developers who write reliable, performant generators from those who write generators that mysteriously bog down the IDE.
This article walks through the internals from the ground up: how Roslyn parses and binds your code, exactly where generators hook in, and the step-by-step execution model behind IIncrementalGenerator. By the end, you will understand the full pipeline and be able to reason confidently about generator design, performance, and IDE integration.
The Roslyn Compiler: A Quick Overview
Before you can understand how C# source generators work, you need a mental model of how the Roslyn compiler itself operates. Roslyn is not just a compiler -- it is a compiler-as-a-platform. It exposes its internal representation of your code as a rich, queryable API, and that API is the foundation source generators are built on.
The compilation pipeline has three major phases.
Parse. The compiler tokenizes and parses your source files into syntax trees. Each SyntaxTree is an immutable, hierarchical representation of a single source file. At this stage, the compiler does not know whether MyClass actually exists or whether a method call is valid -- it only understands the structure of the text. Every class declaration, method call, attribute, and expression is represented as a node in this tree.
Bind (Semantic Analysis). The compiler binds the parsed syntax trees against symbol tables and resolves types, members, namespaces, and overloads. The result is a Compilation object that holds the full semantic model. From this object, you can ask questions like "what type does this expression evaluate to?" or "does this class implement IDisposable?" This is the semantic layer -- the meaning of your code, not just its structure.
Emit. The compiler lowers the bound representation and emits IL (Intermediate Language) into an assembly. This is where the .dll or .exe gets written to disk.
What makes Roslyn unusual is that this entire pipeline is exposed to you via the Microsoft.CodeAnalysis.CSharp NuGet package. You can parse files, build compilations, and query semantic information from any .NET application. Source generators take advantage of this by running inside a live Roslyn compilation.
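The three phases can be driven by hand from ordinary application code. The following is a minimal sketch, assuming a project that references the Microsoft.CodeAnalysis.CSharp package; the `PipelineDemo` class and the `Demo.Greeter` sample source are hypothetical names used only for illustration.

```csharp
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class PipelineDemo
{
    // Returns the class name resolved by the binder and whether emit succeeded.
    public static (string? BoundName, bool EmitSucceeded) Run()
    {
        // Parse: text -> immutable syntax tree (structure only, no meaning yet).
        SyntaxTree tree = CSharpSyntaxTree.ParseText(
            "namespace Demo { public class Greeter { public int Answer() => 42; } }");

        // Bind: syntax trees + references -> Compilation that can answer semantic questions.
        CSharpCompilation compilation = CSharpCompilation.Create(
            assemblyName: "PipelineDemo",
            syntaxTrees: new[] { tree },
            references: new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            options: new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        SemanticModel model = compilation.GetSemanticModel(tree);
        ClassDeclarationSyntax classNode = tree.GetRoot()
            .DescendantNodes().OfType<ClassDeclarationSyntax>().First();
        INamedTypeSymbol? symbol = model.GetDeclaredSymbol(classNode);

        // Emit: lower to IL and write the assembly (into memory rather than to disk).
        using var stream = new MemoryStream();
        bool emitted = compilation.Emit(stream).Success;

        return (symbol?.ToDisplayString(), emitted);
    }
}
```

The syntax node only knows the identifier `Greeter`; the symbol obtained from the semantic model knows the fully qualified name `Demo.Greeter`, which is the difference between the parse and bind layers.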
Where Source Generators Hook Into Roslyn
Source generators execute between the semantic analysis phase and the emit phase. More precisely, they run after the initial compilation is complete -- after all your existing source files have been parsed and bound -- but before the final assembly is emitted.
The Roslyn host (Visual Studio, the .NET SDK build, or Rider) invokes each registered generator and hands it the current Compilation object. The generator inspects that compilation, produces new SyntaxTree objects (generated source files), and hands them back to Roslyn. Roslyn then performs a second compilation pass that incorporates both the original source and the newly generated source, and that merged result is what gets emitted.
This two-pass model has an important consequence: generated code can reference your existing types, and your existing code can reference types that the generator produces -- all within a single logical compilation. The generated output feeds back into a unified result as if you had written those files yourself.
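A rough way to observe this flow outside an IDE is to drive a generator manually with `CSharpGeneratorDriver`, the same entry point commonly used in generator unit tests. This sketch approximates what a host does rather than reproducing it; `HelloGenerator`, `TwoPassDemo`, and the sample source strings are hypothetical.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical generator that always contributes one fixed source file.
[Generator]
public sealed class HelloGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context) =>
        context.RegisterPostInitializationOutput(static ctx =>
            ctx.AddSource("Hello.g.cs",
                "namespace Demo { public static class Hello { public const string Text = \"hi\"; } }"));
}

public static class TwoPassDemo
{
    // Returns the number of syntax trees before and after generation.
    public static (int Before, int After) Run()
    {
        // The user code refers to Demo.Hello, which only exists once the generator has run.
        SyntaxTree userTree = CSharpSyntaxTree.ParseText(
            "public class Consumer { public string Read() => Demo.Hello.Text; }");

        CSharpCompilation input = CSharpCompilation.Create(
            "TwoPassDemo",
            new[] { userTree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        // Run the generator against the input compilation and collect the merged result.
        GeneratorDriver driver = CSharpGeneratorDriver.Create(new HelloGenerator());
        driver.RunGeneratorsAndUpdateCompilation(input, out Compilation output, out _);

        return (input.SyntaxTrees.Count(), output.SyntaxTrees.Count());
    }
}
```

The input compilation holds one tree and the output compilation holds two: the original file plus `Hello.g.cs`. The input compilation itself is never mutated; the driver returns a new one.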
Thinking about this through the lens of the Builder design pattern can be helpful: Roslyn assembles the final compilation incrementally, step by step, and generators are contributors that slot in at a specific, well-defined point in that construction process.
The Two Models: ISourceGenerator vs IIncrementalGenerator
C# source generators were introduced in .NET 5 with the ISourceGenerator interface. The original model was straightforward: implement Initialize and Execute, receive a GeneratorExecutionContext, produce output. It was simple -- but had a critical flaw. The execution model gave generators no way to opt out of re-running when unrelated code changed. Every keystroke in the IDE could trigger a full generator re-execution, which caused severe performance problems in large solutions.
The IIncrementalGenerator interface was introduced in .NET 6 with Roslyn 4.x. It is the current standard and the only interface you should implement for new generator work. The older ISourceGenerator is considered legacy; Microsoft's own documentation explicitly recommends against it.
The key insight behind IIncrementalGenerator is that it separates what to observe from what to produce. Instead of receiving the full compilation and doing whatever you want with it, you declare a pipeline of data transformations that Roslyn can cache, diff, and short-circuit. If the inputs to a pipeline stage have not changed since the last run, Roslyn skips that stage entirely. This caching is what makes incremental generators safe to run hundreds of times per minute in an active IDE session.
How IIncrementalGenerator Executes
The execution model for IIncrementalGenerator centers on a single method: Initialize. Unlike the old model, there is no separate Execute call. Everything is wired up in Initialize through the IncrementalGeneratorInitializationContext parameter.
Here is the minimal structure of an IIncrementalGenerator-based source generator:
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

namespace MyGenerators;

[Generator]
public sealed class MyGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Declare pipelines here -- nothing executes yet.
        // Roslyn invokes Initialize once, then uses the declared
        // structure to determine what to (re)run on each compilation update.
    }
}
This might look odd at first. You are not executing anything inside Initialize -- you are declaring pipelines. Roslyn invokes Initialize once and then uses the declared pipeline structure to decide what to re-run incrementally on every compilation update. The IncrementalGeneratorInitializationContext exposes several built-in providers:
context.SyntaxProvider -- creates providers driven by syntax tree changes
context.CompilationProvider -- a provider that yields the current Compilation
context.AdditionalTextsProvider -- a provider for non-C# files included in the project
context.AnalyzerConfigOptionsProvider -- a provider for MSBuild properties and .editorconfig settings
context.MetadataReferencesProvider -- a provider for referenced assemblies
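As a small illustration of consuming one of these providers, the sketch below reads the assembly name from `CompilationProvider` and emits it as a constant. The `AssemblyNameGenerator` and `BuildInfo` names are hypothetical, and the example assumes the assembly name contains no characters that need escaping in a string literal.

```csharp
using Microsoft.CodeAnalysis;

[Generator]
public sealed class AssemblyNameGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Reduce the Compilation to a single string so later stages compare
        // a small value instead of the whole compilation snapshot.
        IncrementalValueProvider<string> assemblyName =
            context.CompilationProvider.Select(static (compilation, _) =>
                compilation.AssemblyName ?? "Unknown");

        // The output stage only needs to re-run when the selected string changes.
        context.RegisterSourceOutput(assemblyName, static (spc, name) =>
            spc.AddSource("BuildInfo.g.cs",
                $"internal static class BuildInfo {{ public const string AssemblyName = \"{name}\"; }}"));
    }
}
```

The `Select` step is the important part: it narrows a large, frequently changing input down to the one value the generator actually depends on.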
The most common entry point for code generation is context.SyntaxProvider.
The IncrementalValueProvider Pipeline
IncrementalValueProvider<T> and IncrementalValuesProvider<T> are the backbone of the incremental model. A provider represents a value (or a collection of values) that evolves over time as the compilation changes. You build pipelines by chaining transformations on these providers, and Roslyn handles the caching and invalidation automatically.
The two most important factory methods on SyntaxProvider are:
CreateSyntaxProvider. Takes a predicate (a fast, cheap filter run on every syntax node change) and a transform (a richer extraction using the GeneratorSyntaxContext, which includes the SemanticModel). The predicate runs first and should be as lightweight as possible -- Roslyn calls it extremely frequently, sometimes for every node in every changed file.
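A minimal sketch of the predicate/transform split, assuming a hypothetical generator that lists every class implementing IDisposable. The `DisposableListGenerator` and `DisposableTypes` names are illustrative only.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

[Generator]
public sealed class DisposableListGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        IncrementalValuesProvider<string> disposableTypes = context.SyntaxProvider
            .CreateSyntaxProvider(
                // Predicate: syntax only. A type check plus one cheap property read.
                predicate: static (node, _) =>
                    node is ClassDeclarationSyntax { BaseList: not null },
                // Transform: semantic work, run only for nodes the predicate accepted.
                transform: static (ctx, ct) =>
                {
                    INamedTypeSymbol? symbol =
                        ctx.SemanticModel.GetDeclaredSymbol((ClassDeclarationSyntax)ctx.Node, ct);
                    bool disposable = symbol is not null &&
                        symbol.AllInterfaces.Any(static i => i.ToDisplayString() == "System.IDisposable");
                    return disposable ? symbol!.ToDisplayString() : null;
                })
            .Where(static name => name is not null)
            .Select(static (name, _) => name!);

        // Collect gathers all matches into one array so a single file can list them.
        context.RegisterSourceOutput(disposableTypes.Collect(), static (spc, names) =>
        {
            string items = string.Join(", ", names.Select(static n => $"\"{n}\""));
            spc.AddSource("DisposableTypes.g.cs",
                $"internal static class DisposableTypes {{ public static readonly string[] Names = {{ {items} }}; }}");
        });
    }
}
```

The transform returns a plain string rather than the symbol itself, so the value flowing through the pipeline can be compared cheaply between runs.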
ForAttributeWithMetadataName. A higher-level, heavily optimized API introduced in Roslyn 4.3 (Visual Studio 2022 17.3+, which corresponds to the .NET SDK 6.0.400 band). It filters syntax nodes decorated with a specific attribute, identified by its fully qualified metadata name. This is the preferred approach for attribute-driven generators wherever it is available, because Roslyn can skip the semantic lookup entirely for files that do not contain the attribute in question. If you are using the .NET 7 SDK or later, it is always available.
Here is how you build a provider from a custom attribute and wire it to source output:
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

namespace MyGenerators;

[Generator]
public sealed class AutoNotifyGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Step 1: declare what syntax nodes to watch
        IncrementalValuesProvider<ClassDeclarationSyntax?> classProvider =
            context.SyntaxProvider
                .ForAttributeWithMetadataName(
                    fullyQualifiedMetadataName: "MyApp.AutoNotifyAttribute",
                    predicate: static (node, _) => node is ClassDeclarationSyntax,
                    transform: static (ctx, _) => ctx.TargetNode as ClassDeclarationSyntax)
                .Where(static node => node is not null);

        // Step 2: register what to generate when inputs arrive
        context.RegisterSourceOutput(classProvider, static (spc, classNode) =>
        {
            if (classNode is null) return;

            // Simplification: the output goes into a fixed "Generated" namespace.
            // A real generator would emit into the class's own namespace so that
            // this partial declaration merges with the user's class.
            var source = $$"""
                namespace Generated;

                partial class {{classNode.Identifier.Text}}
                {
                    // Auto-generated property notification support
                }
                """;

            spc.AddSource($"{classNode.Identifier.Text}.g.cs", source);
        });
    }
}
The Where call is a filtering transformation -- it drops values that do not meet the condition, which means the downstream RegisterSourceOutput never fires for them. Keeping your pipeline narrow by filtering early is the single most impactful thing you can do for generator performance.
Combining Providers for Semantic Analysis
Real generators rarely need only syntax information. More often you need the Compilation object itself -- to resolve INamedTypeSymbol for a type, check implemented interfaces, or read attribute constructor arguments. This requires combining the syntax provider with the compilation provider.
The Combine method merges two providers, pairing each value from the first with the single value of the second:
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

namespace MyGenerators;

[Generator]
public sealed class FactoryGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        IncrementalValuesProvider<ClassDeclarationSyntax> classProvider =
            context.SyntaxProvider
                .ForAttributeWithMetadataName(
                    "MyApp.GenerateFactoryAttribute",
                    predicate: static (node, _) => node is ClassDeclarationSyntax,
                    transform: static (ctx, _) => (ClassDeclarationSyntax)ctx.TargetNode);

        // Pair each class node with the full compilation snapshot
        var combined = classProvider.Combine(context.CompilationProvider);

        context.RegisterSourceOutput(combined, static (spc, pair) =>
        {
            var (classDecl, compilation) = pair;
            var semanticModel = compilation.GetSemanticModel(classDecl.SyntaxTree);
            var symbol = semanticModel.GetDeclaredSymbol(classDecl);
            if (symbol is null) return;

            var ns = symbol.ContainingNamespace?.IsGlobalNamespace == true
                ? null
                : symbol.ContainingNamespace?.ToDisplayString();

            var source = ns is null
                ? $$"""
                    public static partial class {{symbol.Name}}Factory
                    {
                        public static {{symbol.Name}} Create() => new {{symbol.Name}}();
                    }
                    """
                : $$"""
                    namespace {{ns}};

                    public static partial class {{symbol.Name}}Factory
                    {
                        public static {{symbol.Name}} Create() => new {{symbol.Name}}();
                    }
                    """;

            spc.AddSource($"{symbol.Name}Factory.g.cs", source);
        });
    }
}
This pattern -- combining syntax discovery with full semantic analysis -- is exactly how generators implement real-world use cases like automated factory method generation. The generator finds candidate types at the syntax level cheaply, then enriches them with semantic information only when it needs to. Notice the use of raw string literals with $$"""...""" interpolation -- a C# 11+ feature that makes multiline code generation significantly cleaner than manual string concatenation.
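One caveat with Combine(context.CompilationProvider) is that the compilation changes on every edit, so anything paired with it is re-evaluated on every edit too. When the data you need is reachable from the attributed symbol, a possible alternative is to read it inside the transform, where the context already exposes TargetSymbol and SemanticModel, and pass a small value-equatable model downstream. The sketch below is one way to do that under those assumptions; `FactoryTarget` and `CachedFactoryGenerator` are hypothetical names, and a generator project targeting netstandard2.0 would typically need an IsExternalInit polyfill for the record struct to compile.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical model: two strings that compare by value, so Roslyn's cache
// can tell when a re-run produced the same result as last time.
public readonly record struct FactoryTarget(string? Namespace, string Name);

[Generator]
public sealed class CachedFactoryGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        IncrementalValuesProvider<FactoryTarget> targets = context.SyntaxProvider
            .ForAttributeWithMetadataName(
                "MyApp.GenerateFactoryAttribute",
                predicate: static (node, _) => node is ClassDeclarationSyntax,
                transform: static (ctx, _) =>
                {
                    // TargetSymbol is already bound, so no CompilationProvider is needed.
                    ISymbol symbol = ctx.TargetSymbol;
                    string? ns = symbol.ContainingNamespace is { IsGlobalNamespace: false } n
                        ? n.ToDisplayString()
                        : null;
                    return new FactoryTarget(ns, symbol.Name);
                });

        context.RegisterSourceOutput(targets, static (spc, t) =>
        {
            string body =
                $"public static partial class {t.Name}Factory " +
                $"{{ public static {t.Name} Create() => new {t.Name}(); }}";
            string source = t.Namespace is null ? body : $"namespace {t.Namespace} {{ {body} }}";
            spc.AddSource($"{t.Name}Factory.g.cs", source);
        });
    }
}
```

Because no syntax node or symbol leaves the transform, an edit that does not change the class name or namespace yields an equal `FactoryTarget`, and the output stage can be skipped.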
What the Generator Receives
Inside RegisterSourceOutput, the SourceProductionContext is your handle to the compilation output. It exposes several key members:
AddSource(string hintName, string source) -- emits a new source file into the compilation. The hintName is the logical file name and must be unique within a single generator invocation.
AddSource(string hintName, SourceText sourceText) -- the overload that accepts a SourceText, which allows you to specify encoding. This matters in certain toolchain configurations where encoding must match.
ReportDiagnostic(Diagnostic diagnostic) -- emits a compiler warning or error back to the developer. Generators can validate attribute usage and fail the build with descriptive messages.
CancellationToken -- always respect this token inside transformations. The IDE cancels generator runs frequently as the user types.
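The members above can be exercised together in one output callback. This is a sketch only: the diagnostic ID `MYGEN001`, its wording, the `MyGenerators` category, and the `PartialCheckGenerator` name are hypothetical, and the pipeline carries a syntax node for brevity even though that limits caching.

```csharp
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Text;

[Generator]
public sealed class PartialCheckGenerator : IIncrementalGenerator
{
    // Hypothetical descriptor; Error severity makes the build fail when reported.
    private static readonly DiagnosticDescriptor MustBePartial = new(
        id: "MYGEN001",
        title: "Class must be partial",
        messageFormat: "Class '{0}' is marked [AutoNotify] but is not declared partial",
        category: "MyGenerators",
        defaultSeverity: DiagnosticSeverity.Error,
        isEnabledByDefault: true);

    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        IncrementalValuesProvider<ClassDeclarationSyntax> classes = context.SyntaxProvider
            .ForAttributeWithMetadataName(
                "MyApp.AutoNotifyAttribute",
                predicate: static (node, _) => node is ClassDeclarationSyntax,
                transform: static (ctx, _) => (ClassDeclarationSyntax)ctx.TargetNode);

        context.RegisterSourceOutput(classes, static (spc, classDecl) =>
        {
            // Stop early if the host has already abandoned this run.
            spc.CancellationToken.ThrowIfCancellationRequested();
            string name = classDecl.Identifier.Text;

            if (!classDecl.Modifiers.Any(SyntaxKind.PartialKeyword))
            {
                // Point the error at the class name so the IDE underlines it.
                spc.ReportDiagnostic(Diagnostic.Create(
                    MustBePartial, classDecl.Identifier.GetLocation(), name));
                return;
            }

            // SourceText overload: the encoding is stated explicitly.
            // The file contains only comments, as a placeholder for real output.
            string source = $"// <auto-generated/>\n// {name} passed the partial check.\n";
            spc.AddSource($"{name}.Checked.g.cs", SourceText.From(source, Encoding.UTF8));
        });
    }
}
```

Reporting the diagnostic and returning, instead of emitting code that would not compile, gives the developer one clear error at the offending declaration.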
The Compilation object available through CompilationProvider is a full snapshot of the current compilation state. You can traverse all SyntaxTree objects in compilation.SyntaxTrees, resolve symbols with compilation.GetTypeByMetadataName(...), inspect referenced assemblies, and get a SemanticModel for any syntax tree. The SemanticModel is what lets you answer semantic questions -- the fully qualified name of a type reference, whether a class implements a specific interface, or what an attribute's constructor arguments resolve to.
Understanding how C# source generators work at this level -- the Compilation, SemanticModel, and SyntaxTree triad -- is essential for generators that need accurate type resolution rather than simple text pattern matching.
What the Generator Outputs
Every call to spc.AddSource(hintName, source) produces a new source file that Roslyn incorporates into the compilation. A few conventions govern how you should structure this output.
Naming conventions. Use a descriptive hintName that includes the type name and ends in .g.cs. The .g.cs convention signals to IDEs, analyzers, and developers that the file is generated. For example: MyClass.g.cs, MyClassFactory.g.cs, or MyNamespace_MyClass.g.cs. Duplicate hint names within a single generator will throw an exception.
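One possible way to derive hint names in the MyNamespace_MyClass.g.cs style is to sanitize the fully qualified type name. The `HintNames.For` helper below is hypothetical and deliberately simple: distinct type names that differ only in punctuation could still collide, so a production generator may need a stronger scheme.

```csharp
using System.Text;

public static class HintNames
{
    // Builds "Namespace_Type.g.cs" from a fully qualified type name by replacing
    // every character that is not a letter, digit, or underscore with '_'.
    public static string For(string fullyQualifiedTypeName)
    {
        var sb = new StringBuilder(fullyQualifiedTypeName.Length + 5);
        foreach (char c in fullyQualifiedTypeName)
        {
            sb.Append(char.IsLetterOrDigit(c) || c == '_' ? c : '_');
        }
        return sb.Append(".g.cs").ToString();
    }
}
```

For example, `HintNames.For("MyNamespace.MyClass")` returns `MyNamespace_MyClass.g.cs`. Including the namespace keeps two classes with the same simple name in different namespaces from producing the same hint name.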
Partial classes. The standard pattern is to generate a partial class or interface that complements the user-authored partial. This lets the generator add members without touching the developer's own code. Source generators can automate a wide variety of patterns this way -- generating decorator wrappers (see the decorator pattern guide), enforcing singleton guarantees at compile time, and even powering plugin discovery registration without any runtime reflection.
Generated output location. By default the compiler keeps generated sources in memory and does not write them to disk. Set <EmitCompilerGeneratedFiles>true</EmitCompilerGeneratedFiles> in your .csproj to have them written under the obj/ folder, at a path like obj/Debug/net10.0/generated/<GeneratorAssemblyName>/<GeneratorClassName>/, for easy inspection. The CompilerGeneratedFilesOutputPath property redirects that output to a stable location of your choice.
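A sketch of the relevant project settings, assuming you want the files in a fixed folder under obj/. The folder name `Generated` is an arbitrary choice for illustration.

```xml
<PropertyGroup>
  <!-- Write generator output to disk so it can be inspected or diffed. -->
  <EmitCompilerGeneratedFiles>true</EmitCompilerGeneratedFiles>
  <!-- Optional: a stable location instead of the default per-configuration path. -->
  <CompilerGeneratedFilesOutputPath>$(BaseIntermediateOutputPath)Generated</CompilerGeneratedFilesOutputPath>
</PropertyGroup>
```

Keeping the output under obj/ avoids the files being picked up as ordinary source on the next build. If you point the path at a folder inside the project tree instead, you would likely need to exclude that folder from compilation to prevent duplicate definitions.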
The partial-class approach also suits generated decorator wrappers for cross-cutting concerns: the generator can produce the entire wrapping infrastructure from a simple attribute, leaving the developer's interface definition intact and the boilerplate entirely in the generated layer.
How the IDE Integrates
The IDE experience for source generators differs subtly from the SDK build-time execution, and understanding that difference helps explain some otherwise puzzling behavior.
In Visual Studio and Rider, source generators run continuously as you type -- a process called design-time generation. The generated files appear under the Analyzers node in Solution Explorer, nested inside the project's analyzer tree. They are read-only from the IDE's perspective: you cannot edit them directly, nor should you need to.
Design-time generation uses a background Roslyn workspace that maintains the current compilation state and re-runs generator pipelines as edits arrive. Because it runs on every change, generators that are slow -- or that access the file system, network resources, or environment variables -- cause noticeable editor lag. The incremental model exists precisely to mitigate this by caching pipeline stages and only re-executing what has actually changed.
Hot reload (.NET hot reload / dotnet watch) is a separate path from a full build, so do not rely on it to pick up generator changes. In particular, edits to the generator project itself are not applied by a hot reload patch: the generator assembly has already been loaded by the compiler host, and its new logic takes effect only after a rebuild. This is worth knowing when you expect a generator change to appear immediately during a dotnet watch session -- it will not, until you trigger a rebuild.
Common Pitfalls in Understanding the Pipeline
Several misconceptions repeatedly trip up developers who are learning how source generators work in C#.
"I can use the file system or environment variables inside my generator." You technically can call File.ReadAllText inside a pipeline callback, but you absolutely should not. Accessing the file system bypasses Roslyn's caching model, produces non-deterministic output depending on machine state, and degrades IDE performance significantly. Route all external data through the Roslyn pipeline -- use AdditionalTextsProvider for non-C# files and AnalyzerConfigOptionsProvider for MSBuild properties.
"The generator runs once per build." In an active IDE session, your generator may run hundreds of times per minute. Any expensive work inside your transformation pipeline -- complex LINQ queries, deep recursion, large string allocations -- is multiplied by that frequency. Use ForAttributeWithMetadataName where possible, implement structural equality on your data models so Roslyn's caching can detect unchanged inputs, and keep predicate callbacks trivially fast (a single is type check is ideal).
"I can mutate or replace existing source files." Generators are strictly additive. They can only add new SyntaxTree objects to the compilation; they cannot modify or remove existing ones. This is by design -- it preserves the integrity of the developer's own code and makes the generator model safe to reason about.
"Diagnostics are just warnings." Calling spc.ReportDiagnostic(...) with DiagnosticSeverity.Error will fail the build -- exactly like a compiler error. This is a feature: generators can enforce usage rules on their own attributes and give developers precise, actionable error messages rather than cryptic "type not found" failures.
"ISourceGenerator is fine for small generators." It is not. The absence of caching in the old model means even a "small" generator causes unnecessary IDE slowdowns. The incremental model has negligible overhead when inputs do not change. There is no reason to use ISourceGenerator for any new work.
FAQ
What is the difference between ISourceGenerator and IIncrementalGenerator?
ISourceGenerator is the original, now-deprecated API from .NET 5. Its work happens in an Execute method (alongside a one-time Initialize) that receives the full compilation and runs on every compilation pass -- including every IDE keystroke. IIncrementalGenerator (introduced in .NET 6 with Roslyn 4.0) requires you to declare a pipeline of value providers and transformations upfront. Roslyn caches the output of each pipeline stage and only re-executes stages where the inputs have actually changed. For any new generator work targeting .NET 6 / Roslyn 4.0+, use IIncrementalGenerator exclusively.
When does a source generator re-run in the IDE?
Roslyn re-evaluates pipeline stages when their declared inputs change. A SyntaxProvider stage re-runs when the syntax trees it observes change. A CompilationProvider stage re-runs when the full compilation changes (triggered by any edit). ForAttributeWithMetadataName is particularly optimized -- Roslyn can skip it entirely for edits that occur in files not containing the tracked attribute. The practical takeaway: minimize what your generator observes and use the most targeted provider API available to maximize the amount of work Roslyn can skip.
Can a source generator read from external files or databases?
Source generators should only consume inputs that flow through the Roslyn pipeline: AdditionalTextsProvider for non-C# files listed in the .csproj, and AnalyzerConfigOptionsProvider for MSBuild properties and .editorconfig settings. Accessing the file system directly inside a generator bypasses Roslyn's caching, produces non-deterministic output that changes between machines, and actively degrades IDE responsiveness. Route all external data through MSBuild and the appropriate Roslyn provider.
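As a sketch of that routing, the generator below turns each .txt additional file into a string constant. It assumes the consuming project lists the files with an AdditionalFiles item and that each file name is a valid C# identifier; `TextResourceGenerator`, `TextResources`, and the `InMemoryAdditionalText` helper (included only so the generator can be driven from a test) are hypothetical names.

```csharp
using System;
using System.IO;
using System.Threading;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Text;

[Generator]
public sealed class TextResourceGenerator : IIncrementalGenerator
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // Tuples of strings compare by value, so unchanged files stay cached.
        IncrementalValuesProvider<(string Name, string Content)> files =
            context.AdditionalTextsProvider
                .Where(static file => file.Path.EndsWith(".txt", StringComparison.OrdinalIgnoreCase))
                .Select(static (file, ct) => (
                    Name: Path.GetFileNameWithoutExtension(file.Path),
                    // Read through Roslyn's AdditionalText, not System.IO.File.
                    Content: file.GetText(ct)?.ToString() ?? string.Empty));

        context.RegisterSourceOutput(files, static (spc, file) =>
        {
            // Escape for a verbatim string literal by doubling quotes.
            string literal = file.Content.Replace("\"", "\"\"");
            spc.AddSource($"TextResources.{file.Name}.g.cs",
                $"internal static partial class TextResources {{ public const string {file.Name} = @\"{literal}\"; }}");
        });
    }
}

// Test-only stand-in for a file that MSBuild would normally supply.
public sealed class InMemoryAdditionalText : AdditionalText
{
    private readonly string _content;
    public InMemoryAdditionalText(string path, string content) { Path = path; _content = content; }
    public override string Path { get; }
    public override SourceText? GetText(CancellationToken cancellationToken = default) =>
        SourceText.From(_content);
}
```

In a real project the input would come from an item such as `<AdditionalFiles Include="Resources/*.txt" />` in the .csproj, which lets the build system and the IDE track the files as declared inputs.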
How does a source generator produce output?
You call spc.AddSource(hintName, sourceCode) inside a RegisterSourceOutput callback. The hintName is a unique logical filename (by convention ending in .g.cs). The sourceCode is a plain C# string or SourceText object. Roslyn adds the resulting SyntaxTree to the compilation as if you had written that file yourself. The generated code participates fully in the final compilation -- it can define types your hand-written code depends on, and your hand-written code can call methods the generator produced.
Why does my generated code not appear in the IDE?
Several things can prevent generated files from surfacing. Check that your generator project reference uses OutputItemType="Analyzer" ReferenceOutputAssembly="false". Verify your generator class has the [Generator] attribute. Confirm the Roslyn version your generator targets supports the APIs you are using -- ForAttributeWithMetadataName requires Roslyn 4.3+. Look in the Analyzers node in Solution Explorer; if the generator loaded but produces no output, your pipeline predicates are likely filtering out all candidate nodes. Adding a ReportDiagnostic call inside the pipeline can help confirm whether execution is reaching your callback.
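For reference, a project reference with those attributes looks roughly like the fragment below. The relative path and project name are hypothetical placeholders.

```xml
<ItemGroup>
  <!-- Load the generator as an analyzer; do not reference it as a normal library. -->
  <ProjectReference Include="..\MyGenerators\MyGenerators.csproj"
                    OutputItemType="Analyzer"
                    ReferenceOutputAssembly="false" />
</ItemGroup>
```

It is also worth confirming that the generator project itself targets netstandard2.0, since the compiler host loads generators built for that target.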
Can source generators use async code?
No. The Roslyn incremental pipeline is entirely synchronous. There is no async/await support in generator methods or pipeline callbacks. If you need data that would naturally be retrieved asynchronously -- reading from an API, resolving external metadata -- you must route that data through the pipeline synchronously using MSBuild tasks to pre-fetch it and pass it as additional files or properties. Long-running synchronous work inside generators is actively harmful to IDE performance, so keep all callbacks fast and deterministic.
Do source generators run during dotnet test?
Yes. Source generators execute as part of any dotnet build invocation, and dotnet test triggers an implicit build. This means your test project sees all generated code from generators referenced in its dependency chain. If a generator produces helpers, stubs, or fixtures, you can reference that generator from the test project and the generated code will be available to your tests without any special configuration -- useful for generating test doubles or registration boilerplate automatically.
Conclusion
Understanding how C# source generators work -- from the Roslyn parse, bind, and emit pipeline through the two-pass compilation model to the incremental value provider framework -- gives you the foundation to write generators that are correct, performant, and IDE-friendly.
The core principle of IIncrementalGenerator is that it separates declaration from execution. You declare pipelines in Initialize; Roslyn decides when to run them and what to cache. The more precisely you scope your pipeline inputs -- using ForAttributeWithMetadataName, Where filters, and targeted Combine calls -- the more Roslyn can short-circuit, and the better your generator behaves in an active development session.
From automating factory boilerplate to generating decorator wrappers, source generators unlock a class of solutions that would otherwise require error-prone manual code or slow runtime reflection. The compilation pipeline is the mechanism -- understanding it well is what lets you use that mechanism confidently.
If you are looking at design patterns that generators can automate, the guides on factory methods and decorator patterns are solid next steps for seeing generators applied to real structural problems.

