Skip to content

Latest commit

 

History

History
116 lines (91 loc) · 5.4 KB

parsing-formatting.md

File metadata and controls

116 lines (91 loc) · 5.4 KB

Parsing and formatting

In this section we'll show examples of parsing and formatting values, by way of writing JsonConverters. System.Text.Json excels at memory management. It extensively uses Span<T> and object pooling to avoid allocations, and we can use these techniques ourselves.

Using Utf8Parser and Utf8Formatter

Suppose we want to create a converter that takes strings and converts them from/to a record:

record MyValue(long Value);

When reading the value, the naive approach is to convert it to a string use Int64.Parse. For writing, the naive approach is to use ToString().

class NaiveMyValueConverter : JsonConverter<MyValue>
{
    public override MyValue Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) =>
        new MyValue(long.Parse(reader.GetString()));

    public override void Write(Utf8JsonWriter writer, MyValue value, JsonSerializerOptions options) =>
        writer.WriteStringValue(value.Value.ToString());
}

Instead of allocating a string, we can parse the UTF-8 value directly (similarly to how methods like Utf8JsonReader.GetInt64 are implemented). Note that if the value is a Sequence rather than a Span, we'll allocate an array, but that should rarely happen.

For writing values we'll use stackalloc to avoid the allocation, and the UTF-8 formatter to avoid going through String's UTF-16.

Important

The following method will not work for values that require unescaping (for example, \uXXXX). We can use CopyString to get the unescaped string if needed.

class MyValueConverter : JsonConverter<MyValue>
{
    public override MyValue Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) =>
        (reader.HasValueSequence ? reader.ValueSequence.ToArray() : reader.ValueSpan) is var source &&
        Utf8Parser.TryParse(source, out long value, out var consumed) && consumed == source.Length
            ? new MyValue(value)
            : throw new FormatException();

    [SkipLocalsInit]
    public override void Write(Utf8JsonWriter writer, MyValue value, JsonSerializerOptions options)
    {
        var bytes = (stackalloc byte[128]);
        if (Utf8Formatter.TryFormat(value.Value, bytes, out var written))
        {
            writer.WriteStringValue(bytes[..written]);
        }
        else
        {
            // this would happen if the stackalloc'd size isn't enough (impossible here as any long should fit)
            throw new FormatException();
        }
    }
}

Using JsonSerializer

We can call JsonSerializer with different options inside the converters to efficiently convert values.

Important

Make sure to instantiate JsonSerializerOptions only once, as it's used to cache serialization metadata.

class MyValueConverter : JsonConverter<MyValue>
{
    private static readonly JsonSerializerOptions s_options = new() { NumberHandling = JsonNumberHandling.AllowReadingFromString | JsonNumberHandling.WriteAsString };

    public override MyValue Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) =>
        new(JsonSerializer.Deserialize<long>(ref reader, s_options));

    public override void Write(Utf8JsonWriter writer, MyValue value, JsonSerializerOptions options) =>
        JsonSerializer.Serialize(writer, value.Value, s_options);
}

Using Encoding and ISpanFormattable

Utf8Parser and Utf8Formatter only work for standard formats, specified using a single character (for example, x for hexadecimal numbers). If we want to use custom formats, we'll have to use methods that encode/decode UTF-16 chars rather than UTF-8 bytes. But fear not - we can still avoid most allocations.

Tip

IUtf8SpanParsable and IUtf8SpanFormattable can be used with custom formats.

Important

Do NOT use ValueSpan or ValueSequence directly to get string data from Utf8JsonReader - it may be not be well-formed, and may require unescaping. Use CopyString instead.

class DateTimeConverter : JsonConverter<DateTime>
{
    private const string DateFormat = "yyyy-MM-dd";
    
    [SkipLocalsInit]
    public override DateTime Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        var length = reader.HasValueSequence ? checked((int)reader.ValueSequence.Length) : reader.ValueSpan.Length;
        // allocate a buffer on the stack if possible, otherwise use array pool
        using SpanOwner<char> chars = length <= StackallocThreshold ? new(stackalloc char[StackallocThreshold], length) : new(length);
        // copy the string data to the buffer, unescaping and validating it
        var written = reader.CopyString(chars.Span);
        // parse the date
        return DateTime.ParseExact(chars.Span[..written], DateFormat, provider: CultureInfo.InvariantCulture);
    }

    public override void Write(Utf8JsonWriter writer, DateTime value, JsonSerializerOptions options)
    {
        // in this case we know exactly the length the format string will yield
        // so it is safe to allocate its size and assume formatting must succeed
        var chars = (stackalloc char[DateFormat.Length]);
        var success = value.TryFormat(chars, out var written, DateFormat, provider: CultureInfo.InvariantCulture);
        Debug.Assert(success);
        writer.WriteStringValue(chars[..written]);
    }
}