Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added method to detect if a yaml fragment is a sequence #720

Merged
merged 2 commits into from
Sep 18, 2024

Conversation

mizrael
Copy link
Contributor

@mizrael mizrael commented Sep 18, 2024

Added a new CheckIsSequence method to detect if a yaml fragment is a sequence. It internally creates a new YamlDotNet deserializer that uses the custom type converter YamlSequenceTesterConverter.
This converter will simply check if the YAML fragment starts with the SequenceStart event and ends with SequenceEnd.

The following benchmark shows the results of a few deserialization methods:

  • Direct deserialization to IEnumerable<object> using a basic YamlDotNet deserializer
  • Direct deserialization to object[] using a basic YamlDotNet deserializer
  • Direct deserialization to Dictionary<string,object>[] using a basic YamlDotNet deserializer
  • the newly introduced CheckIsSequence method
  • deserialization using the v2 deserializer
  • deserialization using the v3 deserializer
    image

results from CheckIsSequence are not so different from Direct deserialization to IEnumerable<object>. We might decide to remove the custom YamlSequenceTesterConverter in the future and rely directly on the base library instead.

@mizrael mizrael requested review from a team as code owners September 18, 2024 16:35
@mizrael mizrael merged commit 1b70d5c into master Sep 18, 2024
4 checks passed
@mizrael mizrael deleted the users/davgui/yaml-sequence-check branch September 18, 2024 16:46
[DataRow(@"_TestData/SchemaV3_0/Examples/Src/Components/MyHeaderComponent.pa.yaml", false)]
[DataRow(@"_TestData/ValidYaml-CI/With-list-of-controls.pa.yaml", true)]
[DataRow(@"_TestData/InvalidYaml/Dupliacte-Keys.pa.yaml", false)]
public void IsSequenceCheckShouldWorkAsExpected(string path, bool expected)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please simplify this unit tests so it doesn't require external files for the baseline cases.
use targeted input strings for testing, don't use existing large files as these can change due to their purpose etc.

var results = deserializer.Deserialize<object[]>(reader);
return results is not null;
}
catch (YamlException)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should not need to use exceptions as control-flow.

Another design option would be to deseralize into a custom type, register a converter for that type, and in the converter populate a property which has the 'result' of the detection.

e..g

private record SequenceDetectorResult
{
  public required bool IsSequence { get; init; }
}


public object? ReadYaml(IParser parser, Type type)
{
if (parser.Current is not SequenceStart)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when using the SequenceDetectorResult pattern I mentioned above, this converter could just have this SequenceStart check, completely skip the contents, and then return the result value.

the benefit here is that there wouldn't be an exception thrown for code flow control.

@joem-msft
Copy link
Contributor

I think this is a simple imlpementation that doesn't require calling the Deserializer infrastructure:

    public static bool DetectIsSequenceAtRoot(string yaml)
    {
        using var reader = new StringReader(yaml);
        return DetectIsSequenceAtRoot(reader);
    }

    public static bool DetectIsSequenceAtRoot(StringReader reader)
    {
        var parser = new Parser(reader);

        // Need to consume the StreamStart and DocumentStart events to get to the root sequence
        parser.TryConsume<StreamStart>(out _);
        parser.TryConsume<DocumentStart>(out _);

        return parser.TryConsume<SequenceStart>(out _);
    }

@mizrael mizrael mentioned this pull request Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants