bug(json): re-enabling of `parsing_*` is possible #43

Autoparallel · 2024-08-20T20:03:43Z

Idea

The way we currently handle parsing state does not prevent the `parsing_*` state flags (currently just `parsing_string` and `parsing_number`) from being re-enabled after they have been cleared. As a result, invalid JSON can pass through the parser and the intended value for a key can be obfuscated.

Example

Consider the (invalid) JSON:

{
    "k": "v" 123 "v",
}

At the moment, the parser will read up to the key "k" and, upon reading `:`, will write

[1,1]

to stack position 0.

We will then move through the remaining bytes and toggle `parsing_string` on and off, then `parsing_number` on and off, and finally `parsing_string` on and off again. Nothing rejects the second and third values.
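To make the problem concrete, here is a rough Python model of plain on/off flags. It is only a sketch of the behavior described above, not the actual circuit; `toggle_flags` and `count_value_starts` are made-up names, and the rule that any non-digit outside a string clears `parsing_number` is an assumption.

```python
def toggle_flags(value_bytes):
    """Trace (char, parsing_string, parsing_number) using plain on/off flags."""
    parsing_string, parsing_number = 0, 0
    trace = []
    for ch in value_bytes:
        if ch == '"':
            # A quote simply flips the string flag; nothing remembers
            # that a value was already completed for this key.
            parsing_string ^= 1
        elif parsing_string == 0:
            # Assumption: outside a string, digits set the number flag
            # and any other byte clears it.
            parsing_number = 1 if ch.isdigit() else 0
        trace.append((ch, parsing_string, parsing_number))
    return trace


def count_value_starts(trace):
    """Count 0 -> 1 transitions of either flag, i.e. how many values were begun."""
    starts = 0
    prev_s, prev_n = 0, 0
    for _, s, n in trace:
        starts += (prev_s == 0 and s == 1) + (prev_n == 0 and n == 1)
        prev_s, prev_n = s, n
    return starts


# The bytes that follow the colon in the example above.
trace = toggle_flags('"v" 123 "v"')
print(count_value_starts(trace))  # prints 3: three values begin after one key
```

Both flags end at 0, so looking only at the final state this input is indistinguishable from a single well-formed value.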

Solution

This can likely be solved by having these parsing states be stack variables in a new index and adding constraints in the following way.

Consider now the sequence:

read `:`  --> [1,1,0,0]
read `"` --> [1,1,1,0]
read `"` --> [1,1,2,0]
read `1` --> FAIL
read `"` --> FAIL

In this case, we write the additional 1 into position 2 of the stack at height 0, indicating that we have entered a string value. Upon reading the second `"` we increment position 2 of the stack at height 0 to get [1,1,2,0], which indicates we have cleared the string value. At this point, we constrain the parser to not allow any further value parsing.

The interaction with a comma would remain mostly unchanged in that we would require, for instance, seeing [1,1,2,0] before reading a comma and then we'd see:

read `:`  --> [1,1,0,0]
read `"` --> [1,1,1,0]
read `"` --> [1,1,2,0]
read `,` --> [1,0,0,0]

Also, we can repeat this same process for when we are parsing a key instead of a value, and enforce the key is a string. E.g.,

read `{` --> [1,0,0,0]
read `"` --> [1,0,1,0]
read `"` --> [1,0,2,0]
read `:` --> [1,1,0,0]

Similarly, if in the explanation above we had instead read a number:

read `:`  --> [1,1,0,0]
read `1` --> [1,1,0,1]
read ` ` --> [1,1,0,2]
read `1` --> FAIL
read `"` --> FAIL

we do the same, using stack position 3 to track `parsing_number`; a 2 written to this position indicates that the number has been cleared and that no more value types may be parsed.
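The transitions in this section can be sketched as a small state machine over a single stack row at height 0. This is a hedged Python model rather than circuit code: `step`, `run`, `rejects`, and `ParseError` are hypothetical names, the reading of positions 0 and 1 as "inside an object" and "after the colon" is my interpretation of the traces, and only `{`, `"`, `:`, `,`, digits, and whitespace are modeled (no nesting, no escaped quotes, no `}`). Treating any non-digit as the end of a number is also an assumption.

```python
class ParseError(ValueError):
    """Raised where the traces above say FAIL."""


def step(row, ch):
    """Return the next [obj, colon, string_state, number_state] row."""
    obj, colon, s, n = row
    if s == 1:
        # Inside a string: only the closing quote changes the row (1 -> 2).
        return [obj, colon, 2, n] if ch == '"' else list(row)
    if ch == "{":
        if row != [0, 0, 0, 0]:
            raise ParseError("nested objects are not modeled here")
        return [1, 0, 0, 0]
    if ch == '"':
        # A string may only start if no string or number was seen yet.
        if obj != 1 or s != 0 or n != 0:
            raise ParseError("string not allowed here")
        return [obj, colon, 1, n]
    if ch.isdigit():
        # Numbers are values only (keys must be strings), and a cleared
        # string (s == 2) or cleared number (n == 2) blocks a new one.
        if colon != 1 or s != 0 or n == 2:
            raise ParseError("number not allowed here")
        return [obj, colon, 0, 1]
    if n == 1:
        n = 2  # assumption: any non-digit clears a number in progress
    if ch == ":":
        if [obj, colon, s, n] != [1, 0, 2, 0]:
            raise ParseError("colon must follow a completed string key")
        return [1, 1, 0, 0]
    if ch == ",":
        if colon != 1 or (s, n) not in ((2, 0), (0, 2)):
            raise ParseError("comma must follow a completed value")
        return [1, 0, 0, 0]
    if ch.isspace():
        return [obj, colon, s, n]
    raise ParseError(f"byte {ch!r} is not modeled in this sketch")


def run(text):
    """Feed every byte through step() and collect the rows."""
    row, rows = [0, 0, 0, 0], []
    for ch in text:
        row = step(row, ch)
        rows.append(row)
    return rows


def rejects(text):
    try:
        run(text)
    except ParseError:
        return True
    return False


print(run('{"k": "v",')[-1])          # [1, 0, 0, 0]
print(rejects('{"k": "v" 123 "v"'))   # True
```

With these rules the original example fails at the `1`, because position 2 already holds a 2 when the digit arrives.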

Future Work

Given #32 and #33, the implementation described above seems cleaner and more conducive to keeping invalid JSON from passing through the parser. For those cases, we can add two more stack indices representing `parsing_null` and `parsing_bool` which, for example, could go like so:

read `:` --> [1,1,0,0,0,0]
read `n` --> [1,1,0,0,1,0]
read `u` --> [1,1,0,0,2,0]
read `l` --> [1,1,0,0,3,0]
read `l` --> [1,1,0,0,4,0]

whereby reading any other non-whitespace ASCII after achieving [1,1,0,0,4,0] results in FAIL.

Differentiating `true` from `false` could be handled by filtering for [x,y,0,0,0,5], as only `false` can reach 5 in the final stack position.
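As a rough illustration of the `null` counter, here is a Python sketch over a 6-wide row. The names `NULL_INDEX` and `read_null` are hypothetical, the sketch only covers `null` (the bool position is left untouched), and it follows the rule above literally, so delimiters such as `,` after the literal would still need their own rule.

```python
NULL_INDEX = 4  # stack position used for parsing_null in the trace above


def read_null(row, ch):
    """Advance the null counter, or raise ValueError where the text says FAIL."""
    row = list(row)
    progress = row[NULL_INDEX]
    if progress == len("null"):
        # The literal is complete: only whitespace may follow.
        if ch.isspace():
            return row
        raise ValueError("no further value bytes allowed after null")
    # Only advance when we are at a value position with no other value
    # started, and the byte is the next expected letter of "null".
    at_clean_value = row[:4] == [1, 1, 0, 0] and row[5] == 0
    if at_clean_value and ch == "null"[progress]:
        row[NULL_INDEX] = progress + 1
        return row
    if progress == 0 and ch.isspace():
        return row  # whitespace before the literal starts
    raise ValueError(f"unexpected byte {ch!r} while parsing null")


row = [1, 1, 0, 0, 0, 0]
for ch in "null":
    row = read_null(row, ch)
print(row)  # [1, 1, 0, 0, 4, 0]
```

Checking the byte against `"null"[progress]` means the counter value alone tells us which letter is expected next, so no extra state is needed for this literal.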

Edit: 8/30/24

I think we can compress the stack quite a bit, actually. We need only a stack 3 wide if we do the following encoding:

parsing_string --> [..,..,1] and [..,..,2]
parsing_number --> [..,..,10] and [..,..,20]
parsing_null --> [..,..,100], ..., [..,..,400]
"parsing_true" --> [..,..,1000], ..., [..,..,4000]
"parsing_false" --> [..,..,10000], ..., [..,..,50000]

These values can never overlap under nominal conditions, so we can save 3 x STACK_HEIGHT field elements (going from a stack 6 wide to 3 wide). Likely this same sort of compression could be used elsewhere.
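A quick Python sketch to sanity-check the encoding. The tables mirror the list above; `encode`, `decode`, and `field_elements_saved` are hypothetical helper names, and treating a slot value of 0 as "no value in progress" is an assumption.

```python
# Base multiplier and number of steps for each value type, per the list above.
BASES = {"string": 1, "number": 10, "null": 100, "true": 1000, "false": 10000}
MAX_PROGRESS = {"string": 2, "number": 2, "null": 4, "true": 4, "false": 5}


def encode(kind, progress):
    """Map (value type, progress) to the single number stored in the third slot."""
    if not 1 <= progress <= MAX_PROGRESS[kind]:
        raise ValueError(f"{kind} has no progress step {progress}")
    return BASES[kind] * progress


def decode(code):
    """Recover (value type, progress), or None for an empty slot."""
    if code == 0:
        return None  # assumption: 0 means no value is being parsed
    for kind, base in BASES.items():
        progress, remainder = divmod(code, base)
        if remainder == 0 and 1 <= progress <= MAX_PROGRESS[kind]:
            return kind, progress
    raise ValueError(f"{code} is not a nominal state")


def field_elements_saved(stack_height, old_width=6, new_width=3):
    """Field elements saved by narrowing every stack row."""
    return (old_width - new_width) * stack_height


all_codes = [encode(k, p) for k in BASES for p in range(1, MAX_PROGRESS[k] + 1)]
print(len(all_codes) == len(set(all_codes)))  # True: no two states share a code
print(decode(50000))                          # ('false', 5)
```

Because each type owns its own decimal digit and no progress count exceeds 5, every nominal code decodes to exactly one (type, progress) pair.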


Autoparallel · 2024-08-20T20:04:40Z

Pinned issue as this is a rather invasive endeavor we should decide upon before integrating too many other changes.

Autoparallel · 2024-08-20T20:06:45Z

@lonerapier I'm pinging you here because this has upstream effects on the fetcher/interpreter.

I can tackle these changes if need be, but maybe we should discuss first.

Autoparallel added the bug and json labels Aug 20, 2024

Autoparallel added this to the JSON Selective Disclosure Template milestone Aug 20, 2024

Autoparallel pinned this issue Aug 20, 2024

Autoparallel mentioned this issue Aug 20, 2024

tracking(json): JSON parsing #25

Open

9 tasks

mattes mentioned this issue Aug 21, 2024

feat: Implement rust-only witness validity checker pluto/web-prover#156

Closed

Autoparallel removed this from the MVP(disclosure): JSON Parser and Interpreter milestone Aug 22, 2024

Autoparallel unpinned this issue Aug 23, 2024

Autoparallel changed the title ~~bug(disclosure): re-enabling of parsing_* is possible~~ bug(json): re-enabling of parsing_* is possible Aug 23, 2024
