C# Regex Replace and Split: MatchEvaluator, EnumerateSplits, and Substitutions


C# regex replace and split are the workhorses of text transformation. Whether you're sanitizing user input, reformatting dates, tokenizing configuration files, or processing log streams, Regex.Replace and Regex.Split are indispensable. Alongside them, .NET 7 and .NET 9 added span-based enumeration APIs (Regex.EnumerateMatches and Regex.EnumerateSplits) that avoid intermediate allocations and change how you write performance-critical text processing.

This article covers the full API -- from basic string replacement to MatchEvaluator delegates, substitution syntax, capturing groups in replacements, and the EnumerateSplits API added in .NET 9.


Regex.Replace -- The Basics

Regex.Replace is the primary tool for pattern-based text transformation in C#. You supply a pattern to match and a replacement string (or a callback), and the engine finds every match in the input and substitutes it. Unlike string.Replace, the pattern can be as complex as your transformation requires -- character classes, quantifiers, capturing groups, all are available. The method returns a new string; the original is not modified.

The simplest form replaces every match with a literal string:

using System.Text.RegularExpressions;

var regex = new Regex(@"\d{4}");
string result = regex.Replace("Card: 4242 4242 4242 4242", "****");
Console.WriteLine(result); // Card: **** **** **** ****

You can also call it statically for one-off replacements:

string result = Regex.Replace("Hello   World", @"\s+", " ");
Console.WriteLine(result); // Hello World

Limiting the Number of Replacements

When you only want to replace the first N occurrences, pass a count argument. This is useful when the same pattern appears multiple times in the input but you only want to modify the early occurrences -- for example, redacting only the first occurrence of a sensitive token while leaving the rest intact for context:

var regex = new Regex(@"foo", RegexOptions.IgnoreCase);
string result = regex.Replace("foo FOO foo", "bar", count: 2);
Console.WriteLine(result); // bar bar foo

Starting at a Specific Position

The startat parameter lets you skip the first portion of the input and start matching from a given character offset. This is useful when you've already processed a header or prefix and want replacement to apply only to the body of the text. Combined with count, you get precise surgical control over which matches are replaced:

var regex = new Regex(@"\d+");
string result = regex.Replace("1 2 3 4 5", "X", count: 2, startat: 2);
Console.WriteLine(result); // 1 X X 4 5

Substitution Syntax in Replacement Strings

Regex replacement strings have their own mini-language for referencing captured content. Instead of writing callback code, you can use substitution tokens -- short syntax strings -- directly in the replacement parameter. This is generally the cheapest approach for structural reformatting (like date format conversion or wrapping matches in HTML tags) because no MatchEvaluator delegate is invoked. The substitution tokens you will use most often in .NET are:

| Substitution | Meaning |
| --- | --- |
| $1, $2 | Content of group 1, group 2 |
| ${name} | Content of the named group |
| $0 or $& | Entire matched text |
| $` | Text before the match |
| $' | Text after the match |
| $+ | Content of the last captured group |
| $_ | Entire input string |
| $$ | Literal $ |
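
The less common tokens are easier to understand when you see them side by side. The sketch below runs each one against small, made-up inputs; the variable names and sample strings are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample data, chosen so each token's effect is easy to see.
var pair = new Regex(@"(?<key>\w+)=(?<value>\d+)");
string input = "a=1;b=2";

string swapped   = pair.Replace(input, "${value}=${key}");   // named groups
string wrapped   = pair.Replace(input, "[$&]");              // whole match ($0 works too)
string lastGroup = pair.Replace(input, "$+");                // last group, here "value"
string dollar    = pair.Replace(input, "${key}=$$${value}"); // $$ emits a literal $

// $` and $' copy the input text before and after the match.
string before = Regex.Replace("AB-CD", "-", "$`");
string after  = Regex.Replace("AB-CD", "-", "$'");

Console.WriteLine(swapped);   // 1=a;2=b
Console.WriteLine(wrapped);   // [a=1];[b=2]
Console.WriteLine(lastGroup); // 1;2
Console.WriteLine(dollar);    // a=$1;b=$2
Console.WriteLine(before);    // ABABCD
Console.WriteLine(after);     // ABCDCD
```

Note that a bare $ followed by a digit is always read as a group reference, which is why the literal dollar sign has to be written as $$.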

Reformatting Dates with Backreferences

Named backreferences make date reformatting patterns readable and self-documenting. Instead of using $1, $2, $3 (which forces you to remember the group order), you use ${year}, ${month}, and ${day}, so the replacement string reads almost like a template. Here's a common use case, converting an ISO date to a day-first display format:

var regex = new Regex(@"(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})");
string result = regex.Replace("Date: 2026-05-09", "${day}/${month}/${year}");
Console.WriteLine(result); // Date: 09/05/2026


### Wrapping Matches with the Full Match Reference

The `$0` (or `$&`) substitution token inserts the entire matched text into the replacement. This is useful for annotation tasks: you want to keep every match but wrap it in markup, add punctuation, or surround it with delimiters. No capture groups are needed because `$0` always refers to the full match regardless of what groups are defined:

var regex = new Regex(@"\d+");
string result = regex.Replace("Score: 42, Bonus: 100", "<b>$0</b>");
Console.WriteLine(result); // Score: <b>42</b>, Bonus: <b>100</b>

MatchEvaluator -- Dynamic Replacement Logic

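The MatchEvaluator delegate takes a Match and returns a string (the same shape as Func<Match, string>, although it is a distinct delegate type), and it gives you complete control over what each match is replaced with. This is where Regex.Replace becomes really powerful: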

var regex = new Regex(@"\d+");

string result = regex.Replace(
    "The price is 100 and tax is 15",
    match =>
    {
        int value = int.Parse(match.Value);
        return $"${value * 1.20:F2}";
    });

Console.WriteLine(result);
// The price is $120.00 and tax is $18.00

The lambda receives the Match object, giving you access to groups, index, length, and everything else:

// Require capitalized words so that "and Jane" is not picked up as a name pair.
var namePattern = new Regex(@"(?<first>[A-Z]\w*)\s+(?<last>[A-Z]\w*)");

string result = namePattern.Replace(
    "John Smith and Jane Doe",
    match =>
    {
        var first = match.Groups["first"].Value;
        var last  = match.Groups["last"].Value;
        return $"{last}, {first}";
    });

Console.WriteLine(result); // Smith, John and Doe, Jane

MatchEvaluator for Conditional Replacement

The MatchEvaluator approach is the right choice when the replacement depends on the content of the match itself. Instead of a fixed string, you provide a callback that receives each Match object and returns the replacement. This lets you make decisions based on captured group values, match position, or any other runtime data. A common use case is conditional transformation, where you replace only if the match meets a secondary condition:

var regex = new Regex(@"(?<word>[a-zA-Z]+)");
var reserved = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
    "class", "public", "private", "static", "void", "return"
};

string highlighted = regex.Replace(
    "public static void Main()",
    match =>
    {
        var word = match.Groups["word"].Value;
        // <b> is an assumed wrapper; substitute whatever markup your highlighter uses.
        return reserved.Contains(word) ? $"<b>{word}</b>" : word;
    });

Console.WriteLine(highlighted); // <b>public</b> <b>static</b> <b>void</b> Main()


---

## Using [GeneratedRegex] with Replace

For production code, `[GeneratedRegex]` emits the regex implementation at compile time through a source generator, which typically gives throughput on par with `RegexOptions.Compiled` without the startup cost. It works with all `Replace` overloads including the `MatchEvaluator` form, so you don't have to choose between performance and dynamic replacement logic. The attribute is applied to a `partial` method (usually `static`) that returns `Regex`, declared in a `partial` class:

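```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public partial class TextProcessor
{
    // Matches integers and decimals such as 42 or 3.14159.
    [GeneratedRegex(
        @"(?<number>\d+(?:\.\d+)?)",
        RegexOptions.None,
        matchTimeoutMilliseconds: 500)]
    private static partial Regex NumberPattern();

    public static string NormalizeNumbers(string input)
    {
        return NumberPattern().Replace(input, match =>
        {
            // Parse and format with the invariant culture so '.' is always the decimal separator.
            if (!double.TryParse(match.Groups["number"].Value, NumberStyles.Float,
                    CultureInfo.InvariantCulture, out double val))
                return match.Value;
            return val.ToString("N2", CultureInfo.InvariantCulture);
        });
    }
}
```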

Regex.Split -- Dividing Text by Pattern

Regex.Split divides a string wherever the pattern matches, returning a string array. Unlike string.Split, the delimiter can be a complex pattern:

var regex = new Regex(@"[,;\s]+");
string[] tokens = regex.Split("alpha, beta;  gamma	delta");

foreach (var t in tokens)
{
    Console.WriteLine(t);
}
// alpha
// beta
// gamma
// delta

Limiting Split Results

When you only need the first N segments, pass a count argument to avoid splitting the entire string. The last segment will contain everything after the Nth delimiter -- the remainder of the input is left unsplit. This is useful for header/body parsing, where you want to extract the first few fields and treat everything after as a single blob:

var regex = new Regex(@"\s+");
string[] parts = regex.Split("one two three four five", count: 3);

Console.WriteLine(parts[0]); // one
Console.WriteLine(parts[1]); // two
Console.WriteLine(parts[2]); // three four five

Including Captured Groups in Split Results

When the split pattern contains capturing groups, the captured content is interleaved with the segments. This is a useful way to tokenize while keeping delimiters:

// Capture the separator type
var regex = new Regex(@"([,;])");
string[] tokens = regex.Split("a,b;c,d");

foreach (var t in tokens)
{
    Console.WriteLine($"'{t}'");
}
// 'a'
// ','
// 'b'
// ';'
// 'c'
// ','
// 'd'

This is how you tokenize configuration strings where separator type matters.
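
As a rough illustration of why the separator type can matter, the sketch below treats ';' as a group boundary and ',' as an item separator. That convention, and the variable names, are assumptions made for this example rather than any standard format.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Assumed convention for this sketch: ';' ends a group, ',' separates items within it.
string[] tokens = Regex.Split("a,b;c,d", @"([,;])");

var groups = new List<List<string>> { new List<string>() };
foreach (string token in tokens)
{
    if (token == ";")
        groups.Add(new List<string>());      // start a new group
    else if (token != ",")
        groups[groups.Count - 1].Add(token); // ordinary item: add to the current group
}

foreach (List<string> group in groups)
    Console.WriteLine(string.Join("+", group));
// a+b
// c+d
```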


EnumerateSplits (.NET 9+) -- Zero-Allocation Splitting

Regex.EnumerateSplits is the .NET 9 counterpart to Regex.EnumerateMatches (added in .NET 7). It returns a ValueSplitEnumerator that yields Range values pointing into the original input -- no string array, and no substring allocations unless you materialize a segment yourself (as input[range] does below, purely for printing):

// .NET 9+ only
var regex = new Regex(@"[,;]+");
var input = "alpha,beta;gamma,delta";

foreach (var range in regex.EnumerateSplits(input))
{
    Console.WriteLine(input[range]);
}
// alpha
// beta
// gamma
// delta

Compare the allocations:

var regex = new Regex(@"\s+");
string input = GetLargeString(); // imagine 10MB of text

// Old approach: allocates a string[] + N substring objects
string[] parts = regex.Split(input);

// New approach (.NET 9+): no array and no per-segment strings
foreach (var range in regex.EnumerateSplits(input))
{
    ProcessSegment(input.AsSpan(range));
}

The EnumerateSplits approach is ideal for stream processing, log parsing, or any scenario where you process segments sequentially and don't need all parts in memory simultaneously.
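
A minimal sketch of that sequential style, assuming .NET 9 or later: each segment is parsed straight from a span and then discarded, so no substring is created along the way. The input string and variable names are made up for the example.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

var separator = new Regex(@"[,;\s]+");
string input = "10, 20;30   40"; // hypothetical sample input

long total = 0;
int count = 0;
foreach (Range range in separator.EnumerateSplits(input))
{
    ReadOnlySpan<char> segment = input.AsSpan(range);
    if (segment.IsEmpty)
        continue; // leading or trailing separators produce empty segments

    // int.Parse has a span overload, so no string is allocated per value.
    total += int.Parse(segment, NumberStyles.Integer, CultureInfo.InvariantCulture);
    count++;
}

Console.WriteLine($"{count} values, total {total}"); // 4 values, total 100
```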

EnumerateSplits with ReadOnlySpan

EnumerateSplits takes its input as a ReadOnlySpan<char> (a string converts implicitly), which enables zero-allocation splitting from any character buffer -- stack-allocated buffers, ArrayPool slices, or text already decoded into memory. The ranges returned are relative to the span you passed in, so you can process each segment without materializing a string (the ToString call below exists only for printing):

ReadOnlySpan<char> span = "one::two::three".AsSpan();
var regex = new Regex(@"::");

foreach (var range in regex.EnumerateSplits(span))
{
    var segment = span[range];
    Console.WriteLine(segment.ToString());
}
// one
// two
// three

Practical Examples

The patterns below are written with production use in mind. Each uses [GeneratedRegex] for best performance, handles edge cases conservatively, and sets a timeout to guard against unexpected input patterns. The snippets are class members, so they belong inside a partial class.

Normalizing Whitespace

[GeneratedRegex(@"\s+", RegexOptions.None, matchTimeoutMilliseconds: 200)]
private static partial Regex WhitespacePattern();

public static string NormalizeWhitespace(string input)
    => WhitespacePattern().Replace(input.Trim(), " ");

Converting camelCase to kebab-case

The (?<=[a-z0-9])(?=[A-Z]) pattern combines a lookbehind and a lookahead, both zero-width, to find the boundary between a lowercase letter or digit and an uppercase letter. Inserting a hyphen at that position, then lowercasing the result, produces consistent kebab-case. This is useful for generating CSS class names or URL slugs from C# property names:

[GeneratedRegex(@"(?<=[a-z0-9])(?=[A-Z])", RegexOptions.None, matchTimeoutMilliseconds: 200)]
private static partial Regex CamelToKebabPattern();

public static string ToKebabCase(string input)
    => CamelToKebabPattern().Replace(input, "-").ToLowerInvariant();

// "myPropertyName" -> "my-property-name"


### Scrubbing Sensitive Data from Logs

Log scrubbing is a security-critical use case where you must redact sensitive values before writing to any log sink. The lookbehind assertion `(?<=(password|token|key)=)` ensures only the value portion is redacted -- the key name is preserved, making logs readable while secrets stay protected. Use `IgnoreCase` here so that variants such as `Password=` or `TOKEN=` are caught as well. Because the lookbehind is not anchored to a word boundary, it also fires on longer names that end in one of these words (for example `apikey=`), which errs on the side of redacting more:

[GeneratedRegex(@"(?<=(password|token|key)=)[^\s&""]+", RegexOptions.IgnoreCase, matchTimeoutMilliseconds: 500)]
private static partial Regex SensitiveValuePattern();

public static string ScrubSecrets(string logLine)
    => SensitiveValuePattern().Replace(logLine, "***");

// "token=abc123&other=value" -> "token=***&other=value"

Splitting CSV (Simple, Without Quoted Fields)

For CSV data that doesn't contain quoted fields with embedded commas, a simple regex split is sufficient and significantly more flexible than string.Split(',') because it handles optional whitespace after delimiters. For production CSV parsing with quoted fields, use a dedicated CSV library rather than regex:

[GeneratedRegex(@",\s*", RegexOptions.None, matchTimeoutMilliseconds: 200)]
private static partial Regex CsvSplitPattern();

public static string[] SplitCsv(string line)
    => CsvSplitPattern().Split(line);


---

## Composing Replace with Observer Pattern

In event-driven architectures, you might want each match in a Replace operation to trigger downstream processing. Combining `MatchEvaluator` with the [Observer Design Pattern in C#](https://www.devleader.ca/2026/03/26/observer-design-pattern-in-c-complete-guide-with-examples) lets subscribers react to each substitution without coupling the replacement logic to downstream consumers.
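
One lightweight way to sketch this idea is to let the MatchEvaluator raise a notification for every substitution it performs. The example below uses a plain multicast delegate as a stand-in for a full observer implementation; the `onReplaced` name and its payload are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

var digits = new Regex(@"\d+");
var audit = new List<string>();

// Subscribers are plain delegates here; events or IObservable<T> would follow the same shape.
Action<Match, string> onReplaced = (_, _) => { };
onReplaced += (m, replacement) => audit.Add($"{m.Index}:{m.Value}->{replacement}");
onReplaced += (m, _) => Console.WriteLine($"replaced '{m.Value}'");

string result = digits.Replace("Order 42 shipped 7 items", m =>
{
    string replacement = new string('#', m.Length);
    onReplaced(m, replacement); // notify subscribers; the replacement logic knows nothing about them
    return replacement;
});

Console.WriteLine(result);                   // Order ## shipped # items
Console.WriteLine(string.Join(", ", audit)); // 6:42->##, 17:7->#
```

The replacement logic stays a single lambda, while logging, metrics, or auditing can be attached or removed without touching it.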

Similarly, when organizing code that performs multiple text transformations in sequence, the [Decorator Design Pattern in C#](https://www.devleader.ca/2026/03/14/decorator-design-pattern-in-c-complete-guide-with-examples) is a natural fit -- each decorator wraps the previous transformation, adding one rule at a time.

---


## Building Multi-Step Text Transformation Pipelines

Real-world text processing rarely involves a single replacement. You often need a chain of transformations: normalize whitespace, strip HTML entities, reformat dates, sanitize special characters, and apply domain-specific substitutions -- all in sequence. The question is whether to chain `Regex.Replace` calls, combine patterns, or use a different approach entirely.

Chaining `Replace` calls is the simplest approach for a small number of independent transformations:

```csharp
public static string NormalizeInput(string input)
{
    // Each pass is independent -- order matters only when transforms interact
    input = WhitespacePattern().Replace(input.Trim(), " ");
    input = HtmlEntityPattern().Replace(input, DecodeHtmlEntity);
    input = DateFormatPattern().Replace(input, @"${year}-${month}-${day}");
    return input;
}
```

This is readable and testable -- each pattern can be tested independently. The drawback is multiple passes over the string. For short strings (a few hundred characters), this is negligible. For long strings processed thousands of times per second, measuring is worthwhile.

A single-pass approach uses one MatchEvaluator with a combined alternation pattern:

var combinedPattern = new Regex(
    @"(?<whitespace>\s{2,})" +
    @"|(?<entity>&amp;|&lt;|&gt;|&quot;)" +
    @"|(?<date>(?<y>\d{4})/(?<m>\d{2})/(?<d>\d{2}))",
    RegexOptions.None, TimeSpan.FromSeconds(1));

string result = combinedPattern.Replace(input, m =>
{
    if (m.Groups["whitespace"].Success) return " ";
    if (m.Groups["entity"].Success) return DecodeEntity(m.Value);
    if (m.Groups["date"].Success)
        return $"{m.Groups["y"]}-{m.Groups["m"]}-{m.Groups["d"]}";
    return m.Value;
});

The single-pass approach makes one traversal of the string but produces a more complex, harder-to-maintain pattern. For most applications, the chained approach is preferable -- profile before optimizing.

Pipelines also benefit from composition. If you're following the Decorator Design Pattern in C#, each decorator wraps the input/output of the previous transformation, letting you build and test each step independently.
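
A minimal sketch of that composition, using plain Func<string, string> delegates rather than a full decorator class hierarchy: each step wraps the output of the one before it. The step names and the sample input are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Each step is an independent, separately testable transformation.
Func<string, string> collapseWhitespace =
    s => Regex.Replace(s.Trim(), @"\s+", " ");
Func<string, string> slashDatesToIso =
    s => Regex.Replace(s, @"(?<y>\d{4})/(?<m>\d{2})/(?<d>\d{2})", "${y}-${m}-${d}");

Func<string, string>[] steps = { collapseWhitespace, slashDatesToIso };

// Wrap each step around the previous one, so the pipeline runs them in array order.
Func<string, string> pipeline =
    steps.Aggregate((inner, outer) => s => outer(inner(s)));

string output = pipeline("  shipped   2026/05/09  ");
Console.WriteLine(output); // shipped 2026-05-09
```

Adding a rule means appending one more delegate to the array; none of the existing steps need to change.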


Tokenizing and Parsing Structured Text with Split

Regex.Split shines in scenarios where string.Split can't express the delimiter. Several real-world formats use variable delimiters that benefit from regex splitting:

CSV with optional whitespace: Standard CSV fields are comma-delimited, but editors sometimes add spaces around commas. Regex.Split(line, @"\s*,\s*") handles both "a,b" and "a , b" without preprocessing.

Log field splitting: Structured log formats often use mixed delimiters -- pipes, spaces, colons in different positions. A pattern like @"[\s|:=]+" splits on any run of these characters, regardless of order or count.

Camel case to words: The pattern @"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])" splits "MyHTTPClient" into ["My", "HTTP", "Client"] using zero-width lookbehinds and lookaheads, without consuming any characters.

Key-value pair extraction: Configuration strings like "key1=val1;key2=val2,key3=val3" can be split on @"[;,]" to get individual key=value pairs, which are then further split on "=".
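
The four scenarios above can be sketched in a few lines each. The sample inputs below are invented for illustration, and the key-value step assumes every pair contains an '=' sign.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// CSV with optional whitespace around commas.
string[] csv = Regex.Split("a , b,c ,d", @"\s*,\s*");

// Log line with mixed delimiters (hypothetical format).
string[] fields = Regex.Split("2026-05-09 | INFO : user=42", @"[\s|:=]+");

// Camel case to words, using zero-width lookarounds.
string[] words = Regex.Split(
    "MyHTTPClient",
    @"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");

// Key-value pairs: regex for the outer split, string.Split for the inner one.
var settings = new Dictionary<string, string>();
foreach (string pair in Regex.Split("key1=val1;key2=val2,key3=val3", @"[;,]"))
{
    string[] kv = pair.Split('=', 2); // assumes each pair contains '='
    settings[kv[0]] = kv[1];
}

Console.WriteLine(string.Join(" | ", csv));    // a | b | c | d
Console.WriteLine(string.Join(" | ", fields)); // 2026-05-09 | INFO | user | 42
Console.WriteLine(string.Join(" | ", words));  // My | HTTP | Client
Console.WriteLine(settings["key2"]);           // val2
```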

When you need the delimiter itself preserved in the result, use a capturing group in the pattern. Regex.Split includes captured text in the result array interleaved between the segments:

// Split CSV line while preserving the comma type
var result = Regex.Split("first,second;third", @"([,;])");
// ["first", ",", "second", ";", "third"]

This is useful for round-trip text processing where you need to reconstruct the original with modified segments but preserve the original delimiters.
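
A small sketch of such a round trip, assuming the pattern has exactly one capturing group that always participates, so segments land on even indexes and delimiters on odd ones:

```csharp
using System;
using System.Text.RegularExpressions;

string input = "first,second;third";
string[] parts = Regex.Split(input, @"([,;])");

// Even indexes hold segments, odd indexes hold the captured delimiters.
for (int i = 0; i < parts.Length; i += 2)
    parts[i] = parts[i].ToUpperInvariant();

// Concatenating everything restores the original delimiters in place.
string rebuilt = string.Concat(parts);
Console.WriteLine(rebuilt); // FIRST,SECOND;THIRD
```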

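When processing very large inputs, consider replacing Split with EnumerateSplits (.NET 9+) to avoid allocating the full result array. Iterate over the Range values and process each segment as a ReadOnlySpan<char>, writing it to a StringBuilder or output stream without ever materializing the substring. Note that, unlike Split, EnumerateSplits does not include captured groups in its results, so it is not a drop-in replacement when you rely on preserved delimiters.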
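
As a rough sketch of that pattern (assuming .NET 9 or later), the loop below re-joins segments with a new delimiter by appending spans directly to a StringBuilder. The input and the '|' output delimiter are arbitrary choices for the example.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

var separators = new Regex(@"\s*[,;]\s*");
string input = "alpha , beta;gamma ,delta"; // hypothetical sample input

var output = new StringBuilder(input.Length);
foreach (Range range in separators.EnumerateSplits(input))
{
    if (output.Length > 0)
        output.Append('|');

    // Append(ReadOnlySpan<char>) copies the characters without creating a substring.
    output.Append(input.AsSpan(range));
}

string joined = output.ToString();
Console.WriteLine(joined); // alpha|beta|gamma|delta
```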


Replacing Without Regex: When to Reconsider

Regex is powerful but has overhead. For simple literal replacements, string.Replace is typically faster. For splitting on a single character, string.Split(char) typically beats Regex.Split. Use regex when:

  • The delimiter is a pattern, not a literal
  • The replacement depends on the match content (MatchEvaluator)
  • You need named captures in the replacement
  • You need to limit or position the replacement
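
The difference between a literal and a pattern is easy to get wrong, so here is a short illustrative comparison. It also shows why a literal passed to Regex.Replace needs Regex.Escape.

```csharp
using System;
using System.Text.RegularExpressions;

string text = "a.b.c";

// Literal replacement: string.Replace is enough.
string literal = text.Replace(".", "-");

// The same job with regex needs escaping, because '.' is a metacharacter.
string escaped   = Regex.Replace(text, Regex.Escape("."), "-");
string unescaped = Regex.Replace(text, ".", "-"); // matches every character

// A real pattern: collapse runs of dots, which string.Replace cannot express in one call.
string collapsed = Regex.Replace("a..b...c", @"\.+", ".");

Console.WriteLine(literal);   // a-b-c
Console.WriteLine(escaped);   // a-b-c
Console.WriteLine(unescaped); // -----
Console.WriteLine(collapsed); // a.b.c
```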

For complex parsing tasks that go beyond what regex can cleanly handle -- like nested structures or context-dependent grammar -- consider a proper parser library. If you're building plugin-based architectures in C#, you might even make each plugin contribute its own text transformation rules.


FAQ

How do I use backreferences in C# Regex.Replace?

In the replacement string, use $1, $2 for numbered groups or ${name} for named groups. For example, regex.Replace(input, "${day}/${month}/${year}") rearranges a date where the groups are named day, month, and year.

What is a MatchEvaluator in C#?

MatchEvaluator is a delegate that takes a Match and returns a string, the same shape as Func<Match, string>. You pass it to Regex.Replace instead of a replacement string, giving you full control over what each match becomes. The callback receives the Match object so you can access groups, apply logic, call external APIs, or build the replacement string dynamically.

What is the difference between Regex.Split and string.Split in C#?

string.Split splits on literal characters or strings. Regex.Split splits on any pattern. Regex.Split also supports interleaving captured groups in the result (by using capturing groups in the pattern). For simple splits, string.Split is faster. For complex delimiters, use Regex.Split.

What is EnumerateSplits in .NET 9?

Regex.EnumerateSplits is a .NET 9+ API that splits input using regex but yields Range values instead of building a string array. The enumeration itself creates no substrings; you use the ranges to slice the original string or span yourself. Unlike Regex.Split, it does not include captured groups in its output. Ideal for high-throughput text processing pipelines.

Can I limit how many replacements Regex.Replace makes?

Yes. The overload regex.Replace(input, replacement, count) makes at most count replacements, from left to right. There's also a startat parameter to begin replacements at a specific character index.

How do I include captured groups in a Regex.Replace string?

Use substitution tokens: $1 for group 1, $2 for group 2, or ${name} for named groups. $0 or $& inserts the entire match. For example, to surround a match in brackets: regex.Replace(input, "[$0]").

Is Regex.Replace thread-safe in C#?

Yes. A Regex instance is immutable after construction, so its matching methods (Replace, Split, Match, and so on) can be called from multiple threads at once, whether the instance comes from new Regex(...) or from a [GeneratedRegex] method. The static Regex methods are thread-safe as well. The MatchEvaluator delegate itself must be thread-safe if it accesses shared state.
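
A minimal sketch of the shared-state caveat: one Regex instance is used from many threads, and the evaluator protects its counter with Interlocked. The counter and the input strings are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Threading;
using System.Threading.Tasks;

var digits = new Regex(@"\d+"); // one instance shared by all threads
int replacements = 0;           // shared state touched by the evaluator

Parallel.For(0, 100, i =>
{
    _ = digits.Replace($"item {i} of 100", m =>
    {
        // A plain replacements++ would race; Interlocked keeps the count exact.
        Interlocked.Increment(ref replacements);
        return "#";
    });
});

Console.WriteLine(replacements); // 200 (two numbers in each of 100 inputs)
```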

