Before Span
, we used String.Substring
to get a part of a string, which allocates a new string and copies the characters to it.
Now we can just slice the Span
using the range operator, an Span
to a method that accepts a ReadOnlySpan<char>
parameter, such as Int32.Parse
. See Basics for a partial list of Span
-aware APIs.
int.Parse("12345".AsSpan()[1..3]) // results in 23
const string text = "abcdefghijklmnopq";
string ConcatSubstring() => text[10..] + "---" + text[..5];
string ConcatAsSpan() => string.Concat(text.AsSpan()[10..], "---", text.AsSpan()[..5]);
BenchmarkDotnet results - note the reduction in allocated memory:
Method | Mean | Error | StdDev | Gen 0 | Allocated |
---|---|---|---|---|---|
Substring | 30.28 ns | 0.633 ns | 0.621 ns | 0.0306 | 128 B |
AsSpan | 17.30 ns | 0.377 ns | 0.476 ns | 0.0134 | 56 B |
When creating a new string whose size is known in advance, we can use the String.Create
method to avoid additional allocations. This method works by allocating a string and providing a writable Span
within a delegate. It's safe (meaning the string can't be mutated after creation) since the Span
can't escape from the delegate.
Important
Avoid capturing variables in the delegate as they incur an allocation - pass all data using the state
parameter. Use the static
keyword to enforce this.
static string Reverse(this string s) =>
string.Create(s.Length, s, static (span, str) =>
{
str.AsSpan().CopyTo(span);
span.Reverse();
});
Tip
Can be written using RandomNumberGenerator.GetString(size, "abcdefghijklmnopqrstuvwxyz")
, which is also cryptographically secure.
static string GetRandomString(int size, char min = 'a', char max = 'z') =>
string.Create(size, (min, max), static (span, state) =>
{
for (int i = 0; i < span.Length; i++)
{
span[i] = (char)Random.Shared.Next((int)state.min, (int)state.max + 1);
}
});
ref struct
s such as Span
can't be used in generic type parameters (since there's currently no way to prevent boxing), so if we want to base one string on another, we'll have to use unsafe
code - a pointer - in order to pass it into the generic SpanAction<T, TArg>
delegate.
Important
Be careful when using unsafe code - it could easily lead to memory corruption. Extensively cover it with tests.
#pragma warning disable CS8500
unsafe static string ReplaceNonAlphanumeric(this ReadOnlySpan<char> s, char replacement)
{
return string.Create(s.Length, (replacement, spanPtr: (IntPtr)(&s)), static (span, state) =>
{
var sourceSpan = *(ReadOnlySpan<char>*)state.spanPtr;
for (int i = 0; i < span.Length; i++)
{
var c = sourceSpan[i];
span[i] = char.IsLetterOrDigit(c) ? c : state.replacement;
}
});
}
#pragma warning restore CS8500
Using ReadOnlySpan<char>
as the parameter rather than a string, allows us a lot of flexibility - we can stack-allocate the string, and we can transform it without allocations, as we'll see in the next section.
We can use Span<char>
to mutate strings while minimizing allocations. The MemoryExtensions
class contains extension methods that are similar to System.String
methods.
For example, we can use MemoryExtensions.Trim
, and use the range operator to get a substring.
" abcde ".AsSpan().Trim()[..3] // results in "abc"
To use a method that transforms the string, such as ToLowerInvariant
, we can use String.Create
, chaining multiple mutations within the delegate.
static string SubstringToLower(this string s, int length) =>
string.Create(length, (s, length), static (span, state) => state.s.AsSpan()[..state.length].ToLowerInvariant(span));
We can also create an alternative design for the ReplaceNonAlphanumeric
method that does not use unsafe code by adding a parameter for the destination span.
static void ReplaceNonAlphanumeric(this ReadOnlySpan<char> source, Span<char> destination, char replacement)
{
for (int i = 0; i < destination.Length; i++)
{
var c = source[i];
destination[i] = char.IsLetterOrDigit(c) ? c : replacement;
}
}
To use this method, we'll need to allocate the destination span. If our final destination is a String
we can use String.Create
, similarly to how we called ToLowerInvariant
.
string.Create(length: 3, " a^b ", static (span, str) =>
str.AsSpan().Trim()[..span.Length].ReplaceNonAlphanumeric('_', span)); // results in "a_b"
Interpolated string handlers are a C# feature that optimizes string creation by breaking them into multiple "append" calls, rather than allocating a string using String.Format
, and potentially creating multiple interim strings (for example, formatting a number). This allows us to "hack" the compiler into all sorts of interesting optimizations.
Handlers available in .NET:
DefaultInterpolatedStringHandler
is used in aString.Create
overload, and emitted by the compiler for regular interpolated strings.AppendInterpolatedStringHandler
is used inStringBuilder.Append
. No longer is it needed to break an interpolated string into multipleAppend
calls for efficiency.AssertInterpolatedStringHandler
andWriteIfInterpolatedStringHandler
are used inDebug.Assert
andDebug.WriteIf
respectively and completely skip writing according to the condition argument.TryWriteInterpolatedStringHandler
is used byMemoryExtensions.TryWrite
and allows us to efficiently write strings into aSpan<char>
. There's a similar one inUtf8.TryWrite
to write into aSpan<byte>
.
dotNext also includes:
BufferWriterInterpolatedStringHandler
that writes into anIBufferWriter<char>
.PoolingInterpolatedStringHandler
which allows using aMemory<T>
pool rather than allocating new buffers.
Tip
We can reuse any of the above handlers to write custom ones.
The following method checks if the host & port part of a Uri
matches a list of patterns. Rather than using Uri.GetLeftPart
which allocates a string, we'll use TryWrite
. Because it uses an interpolated string handler, there are no additional allocations (for example, to convert the port number into a string). Regex
also works directly with Span<char>
.
[SkipLocalsInit]
bool MatchesUriPattern(IEnumerable<Regex> patterns, Uri uri)
{
var maxLength = uri.Scheme.Length + "://".Length + uri.Host.Length + ":".Length + 5;
using SpanOwner<char> hostAndPort = maxLength <= 256 ? new(stackalloc char[256], maxLength) : new(maxLength);
hostAndPort.Span.TryWrite($"{uri.Scheme}://{uri.Host}:{uri.Port}", out var written);
var hostAndPortSpan = hostAndPort.Span[..written];
foreach (var pattern in patterns)
{
if (pattern.IsMatch(hostAndPortSpan))
{
return true;
}
}
return false;
}
Interpolated string handlers have overloads for ReadOnlySpan<char>
, which allows us to get substrings without allocating new strings.
Tip
Consider String.Create
and String.Concat
before using an interpolated string, as they would be more efficient.
string Format(string str, int num)
{
var index = str.IndexOf(':');
return $"{str.AsSpan()[..index]} {num} {str.AsSpan()[(index + 1)..]}";
}
StringBuilder
is a reference type and often in order to avoid allocating new ones, we pool them, or store them in reusable thread-static fields. As an alternative, we can use DefaultInterpolatedStringHandler
as an append-only value type string builder.
Tip
StringBuilder
works more like a linked list of arrays while DefaultInterpolatedStringHandler
works more like a dynamic array. This means that StringBuilder
might perform better when we can't approximate the final size of the string.
Tip
When possible, it's preferable to use stackalloc
to provide the initial buffer as it performs better. If we don't provide an initial buffer, it uses a rented array with a size calculated from the parameters literalLength
and formattedCount
.
Important
The handler uses ArrayPool<char>
internally when it grows out of the initial buffer, so we must call ToStringAndClear
to return the rented array rather than ToString
.
string BuildString(int count)
{
var initialBuffer = (stackalloc char[256]);
var builder = new DefaultInterpolatedStringHandler(0, 0, CultureInfo.InvariantCulture, initialBuffer);
builder.AppendLiteral("hello");
for (int i = 0; i < count; i++)
{
builder.AppendLiteral(", ");
builder.AppendFormatted(i);
}
return builder.ToStringAndClear();
}
To get a hexadecimal string representing the hash of a string, we'll use the SHA256
algorithm (it could be replaced with others available in .NET).
Important
The methods below will produce different hashes.
We can use MemoryMarshal
to reinterpret the string's char array as bytes (an
However it can (depending on the input) double the time to hash the string as we'll get
static string GetSha256(this string s)
{
var hash = (stackalloc byte[32]);
SHA256.HashData(MemoryMarshal.AsBytes(s.AsSpan()), hash);
return Convert.ToHexString(hash);
}
UTF-8 encoding can produce a lower byte count for many strings, which would make the hash function faster. We'll use SpanOwner
again to stackalloc
or rent an array.
const int StackallocThreshold = 256;
[SkipLocalsInit]
static string GetSha256(this string s)
{
int inputByteCount = Encoding.UTF8.GetByteCount(s);
using SpanOwner<byte> encodedBytes = inputByteCount <= StackallocThreshold ? new(stackalloc byte[StackallocThreshold], inputByteCount) : new(inputByteCount);
int encodedByteCount = Encoding.UTF8.GetBytes(s, encodedBytes.Span);
var hash = (stackalloc byte[32]);
SHA256.HashData(encodedBytes.Span[..encodedByteCount], hash);
return Convert.ToHexString(hash);
}
MemoryExtensions
methods Split
and SplitAny
allow us to split strings with no allocations. Unlike String.Split
, it writes the results to a Span<Range>
, which means we have to pre-allocate the ranges. If there are more matches than the ranges provide, the last range will contain the remainder of the string.
Tip
We're using collection expressions to create a ReadOnlySpan<string>
of separators (of course, it can also be allocated statically once).
var stringToSplit = ";;11==22";
var spanToSplit = stringToSplit.AsSpan();
var ranges = (stackalloc Range[2]);
ReadOnlySpan<string> separators = [ "==", ";;" ];
var count = spanToSplit.SplitAny(ranges, separators, StringSplitOptions.RemoveEmptyEntries);
Debug.Assert(count == 2);
var result = (int.Parse(spanToSplit[ranges[0]]), int.Parse(spanToSplit[ranges[1]])); // results in (11, 22)
Methods like ContainsAny
and IndexOfAny
can benefit from various optimizations, depending on the characters searched. For example, if the value contain only ASCII characters, or whether it's up to 5 characters, or represents a contiguous range (e.g. a-z). Determining the optimization is done once when StringValues
is created.
Many .NET classes, such as JSON parsing, regular expressions and Uri
have been enhanced with it.
private static readonly SearchValues<char> s_lineEndings = SearchValues.Create("\n\r\f\u0085\u2028\u2029");
int CountLineEndings(ReadOnlySpan<char> s)
{
int count = 0;
int pos;
while ((pos = s.IndexOfAny(s_lineEndings)) >= 0)
{
count++;
s = s.Slice(pos + 1);
}
return count;
}