Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multipleOf less than 1 doesn't validate with json_matches_schema #67

Open
2 tasks done
johncmacy opened this issue Jan 2, 2025 · 6 comments
Open
2 tasks done
Labels
bug Something isn't working

Comments

@johncmacy
Copy link

Bug report

  • I confirm this is a bug with Supabase, not with my own application.
  • I confirm I have searched the Docs, GitHub Discussions, and Discord.

Describe the bug

Number schemas with a multipleOf less than 1 don't validate. For example, 17.2 should be valid against the schema {"type": "number","multipleOf": 0.1}, but json_matches_schema returns false.

I have verified the expected behavior using jsonschemavalidator.net.

To Reproduce

Steps to reproduce the behavior, please provide code snippets or a repository:

Run the following in a Postgres database with the extension installed:

SELECT json_matches_schema(
  '{"type": "number","multipleOf": 0.1}',
  '17.2'
);

The above statement returns false.

Expected behavior

The above statement should return true.

Screenshots

N/A

System information

  • PostgreSQL: v16 (Neon)

Additional context

N/A

@johncmacy johncmacy added the bug Something isn't working label Jan 2, 2025
@imor
Copy link
Contributor

imor commented Jan 3, 2025

This is a bug in the jsonschema crate used by pg_jsonschema. Specifically, this function first uses float types to find out if a float value is a multiple of another and only if the result is a NaN value it falls back to fraction::BigFraction, apparantly to save allocations. The float calculations don't always work. E.g. in your specific example values of 17.2 and 0.1 the result of division is 17.2 / 0.1 = 171.99999999999997. This value's mod with 0.1 is 171.99999999999997 % 1 = 0.9999999999999716 which is much larger than f64::EPSILON. The fix is to probably always use BigFraction for these calculations.

@johncmacy would you like to open a bug in the jsonschema repo? @Stranger6667 should be able to help you.

@Stranger6667
Copy link
Contributor

The JSON Schema spec defines multipleOf like this:

The value of "multipleOf" MUST be a number, strictly greater than 0.
A numeric instance is valid only if division by this keyword's value results in an integer.

And validation of numeric instances:

The JSON specification allows numbers with arbitrary precision, and JSON Schema does not add any such bounds. This means that numeric instances processed by JSON Schema can be arbitrarily large and/or have an arbitrarily long decimal part, regardless of the ability of the underlying programming language to deal with such data.

However, the spec does not require specific numeric representations (e.g., IEEE754 vs. decimal). This creates issues with f64 due to its inherent floating-point precision issues. For instance, 17.2 / 0.1 = 171.99999999999997. Since the result is not an exact integer, this would fail multipleOf validation, even though mathematically 17.2 is a multiple of 0.1. Decimal arithmetic avoids this issue by maintaining exact precision during division.

The jsonschema crate defaults to f64 because it relies on the serde_json parser, which uses f64 for number parsing by default. While serde_json has the arbitrary_precision feature to support exact numeric types (it stores String internally), this feature must be enabled at compile-time and is not enabled by default.

In any event, I think that most of the issues related to multipleOf in the jsonschema crate (there are multiple of them already + this one) come from different expectations about the data type - many expect it to behave like decimals but they are double-precision floats. To make the situation better I believe the jsonschema crate can potentially have:

  • better documentation about number representation
  • feature to switch between f64 & decimal

Right now, there is a WIP PR that clarifies the matter. See also this issue.

@imor
Copy link
Contributor

imor commented Jan 3, 2025

feature to switch between f64 & decimal

If jsonschehma exposes a feature flag then pg_jsonschema will enable it to fix this issue.

@johncmacy
Copy link
Author

Thank you both for the explanation. I understand better why it's a problem, but I still think this is a bug that needs to be fixed, not simply a need for better documentation. It sounds like there already are efforts to fix this - do you need me to create an issue, or is that redundant?

FWIW, I'd put my vote in for this package (pg_jsonschema, the Postgres extension) to default to the decimal implementation. IMHO, that would result in the least amount of surprises.

@Stranger6667
Copy link
Contributor

It sounds like there already are efforts to fix this - do you need me to create an issue, or is that redundant?

I think this issue is a good place to track the progress, hence no new issues are needed right now. I plan to consolidate all related information there + clarify a few moments with the JSON Schema folks, then will work on the implementation.

@johncmacy
Copy link
Author

Thanks, much appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants