docs: Update README
Signed-off-by: Dmitry Dygalo <[email protected]>
Stranger6667 committed Sep 14, 2024
1 parent 52de3ae commit 2caf2d9
Showing 7 changed files with 349 additions and 127 deletions.
3 changes: 1 addition & 2 deletions README.md
@@ -4,7 +4,7 @@
[<img alt="docs.rs" src="https://img.shields.io/badge/docs.rs-jsonschema-66c2a5?style=flat-square&labelColor=555555&logo=docs.rs" height="20">](https://docs.rs/jsonschema)
[<img alt="build status" src="https://img.shields.io/github/actions/workflow/status/Stranger6667/jsonschema-rs/ci.yml?branch=master&style=flat-square" height="20">](https://github.com/Stranger6667/jsonschema-rs/actions?query=branch%3Amaster)
[<img alt="codecov.io" src="https://img.shields.io/codecov/c/gh/Stranger6667/jsonschema-rs?logo=codecov&style=flat-square&token=B1EnafGlRL" height="20">](https://app.codecov.io/github/Stranger6667/jsonschema-rs)
<img alt="Supported Dialects" src="https://img.shields.io/endpoint?url=https%3A%2F%2Fbowtie.report%2Fbadges%2Frust-jsonschema%2Fsupported_versions.json&style=flat-square">
[<img alt="Supported Dialects" src="https://img.shields.io/endpoint?url=https%3A%2F%2Fbowtie.report%2Fbadges%2Frust-jsonschema%2Fsupported_versions.json&style=flat-square">](https://bowtie.report/#/implementations/rust-jsonschema)

A high-performance JSON Schema validator for Rust.

@@ -47,7 +47,6 @@ See more usage examples in the [documentation](https://docs.rs/jsonschema).

## Highlights

- 🚀 High-performance validation
- 📚 Support for popular JSON Schema drafts
- 🔧 Custom keywords and format validators
- 🌐 Remote reference fetching (network/file)
4 changes: 0 additions & 4 deletions crates/benchmark-suite/README.md
@@ -66,10 +66,6 @@ Notes:

You can find benchmark code in [benches/](benches/), Rust version is `1.81`.

## Purpose

The `benchmark-suite` crate provides a standardized way to measure and compare the performance of various JSON Schema validation libraries in Rust. It helps in identifying performance bottlenecks and guiding optimization efforts for the `jsonschema` crate.

## Contributing

Contributions to improve, expand, or optimize the benchmark suite are welcome. This includes adding new benchmarks, ensuring fair representation of real-world use cases, and optimizing the configuration and usage of benchmarked libraries. Such efforts are highly appreciated as they ensure accurate and meaningful performance comparisons.
74 changes: 74 additions & 0 deletions crates/jsonschema-py/BENCHMARKS.md
@@ -0,0 +1,74 @@
# Benchmark Suite

A benchmarking suite for comparing different Python JSON Schema implementations.

## Implementations

- `jsonschema-rs` (latest version in this repo)
- [jsonschema](https://pypi.org/project/jsonschema/) (v4.23.0)
- [fastjsonschema](https://pypi.org/project/fastjsonschema/) (v2.20.0)

## Usage

Install the dependencies:

```console
$ pip install -e ".[bench]"
```

Run the benchmarks:

```console
$ pytest benches/bench.py
```

## Overview

| Benchmark | Description | Schema Size | Instance Size |
|----------|------------------------------------------------|-------------|---------------|
| OpenAPI | Zuora API validated against OpenAPI 3.0 schema | 18 KB | 4.5 MB |
| Swagger | Kubernetes API (v1.10.0) with Swagger schema | 25 KB | 3.0 MB |
| GeoJSON | Canadian border in GeoJSON format | 4.8 KB | 2.1 MB |
| CITM | Concert data catalog with inferred schema | 2.3 KB | 501 KB |
| Fast | From fastjsonschema benchmarks (valid/invalid) | 595 B | 55 B / 60 B |

Sources:
- OpenAPI: [Zuora](https://github.com/APIs-guru/openapi-directory/blob/master/APIs/zuora.com/2021-04-23/openapi.yaml), [Schema](https://github.com/OAI/OpenAPI-Specification/blob/main/schemas/v3.0/schema.json)
- Swagger: [Kubernetes](https://raw.githubusercontent.com/APIs-guru/openapi-directory/master/APIs/kubernetes.io/v1.10.0/swagger.yaml), [Schema](https://github.com/OAI/OpenAPI-Specification/blob/main/schemas/v2.0/schema.json)
- GeoJSON: [Schema](https://geojson.org/schema/FeatureCollection.json)
- CITM: Schema inferred via [infers-jsonschema](https://github.com/Stranger6667/infers-jsonschema)
- Fast: [fastjsonschema benchmarks](https://github.com/horejsek/python-fastjsonschema/blob/master/performance.py#L15)

## Results

### Comparison with Other Libraries

| Benchmark | fastjsonschema | jsonschema | jsonschema-rs |
|---------------|----------------|---------------|--------------------------|
| OpenAPI | - (1) | 1477.92 ms (**x92.70**) | 15.94 ms |
| Swagger | - (1) | 2586.88 ms (**x177.61**)| 14.56 ms |
| Canada (GeoJSON) | 22.64 ms (**x5.03**) | 1775.93 ms (**x394.76**) | 4.50 ms |
| CITM Catalog | 10.16 ms (**x1.92**) | 178.60 ms (**x33.73**) | 5.29 ms |
| Fast (Valid) | 3.73 µs (**x3.38**) | 83.84 µs (**x75.94**) | 1.10 µs |
| Fast (Invalid)| 4.24 µs (**x2.77**) | 83.11 µs (**x54.18**) | 1.53 µs |

### jsonschema-rs Performance: `validate` vs `is_valid`

| Benchmark | validate | is_valid | Speedup |
|---------------|------------|------------|---------|
| OpenAPI | 15.94 ms | 15.49 ms | 1.03x |
| Swagger | 14.56 ms | 14.42 ms | 1.01x |
| Canada (GeoJSON) | 4.50 ms | 4.46 ms | 1.01x |
| CITM Catalog | 5.29 ms | 3.01 ms | 1.76x |
| Fast (Valid) | 1.10 µs | 696.00 ns | 1.59x |
| Fast (Invalid)| 1.53 µs | 1.08 µs | 1.42x |
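
The `Speedup` column appears to be the `validate` time divided by the `is_valid` time. The sketch below works under that assumption and uses the rounded timings from the table, so the last digit may differ from the published figures, which were presumably computed from unrounded measurements. The helper names `to_ns` and `speedup` are hypothetical and are not part of the benchmark suite.

```python
# Multipliers for converting the table's timing units to nanoseconds.
UNITS_NS = {"ns": 1.0, "µs": 1e3, "ms": 1e6, "s": 1e9}


def to_ns(timing: str) -> float:
    """Convert a timing such as '15.94 ms' to nanoseconds."""
    value, unit = timing.split()
    return float(value) * UNITS_NS[unit]


def speedup(validate: str, is_valid: str) -> float:
    """Assumed definition: `validate` time divided by `is_valid` time."""
    return to_ns(validate) / to_ns(is_valid)


# Rows taken from the table above.
print(f"OpenAPI: {speedup('15.94 ms', '15.49 ms'):.2f}x")
print(f"CITM Catalog: {speedup('5.29 ms', '3.01 ms'):.2f}x")
```

Converting to a common unit first matters for rows such as `Fast (Valid)`, where the two columns are reported in different units (µs and ns).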

Notes:

1. `fastjsonschema` fails to compile the OpenAPI spec because it contains the `uri-reference` format, which is not defined in Draft 4. However, unknown formats are explicitly supported by the spec.

You can find benchmark code in [benches/](benches/), Python version `3.12.5`, Rust version `1.81`.

## Contributing

Contributions to improve, expand, or optimize the benchmark suite are welcome. This includes adding new benchmarks, ensuring fair representation of real-world use cases, and optimizing the configuration and usage of benchmarked libraries. Such efforts are highly appreciated as they ensure accurate and meaningful performance comparisons.
165 changes: 77 additions & 88 deletions crates/jsonschema-py/README.md
@@ -4,19 +4,53 @@
[![Version](https://img.shields.io/pypi/v/jsonschema-rs.svg)](https://pypi.org/project/jsonschema-rs/)
[![Python versions](https://img.shields.io/pypi/pyversions/jsonschema-rs.svg)](https://pypi.org/project/jsonschema-rs/)
[![License](https://img.shields.io/pypi/l/jsonschema-rs.svg)](https://opensource.org/licenses/MIT)
![Supported Dialects](https://img.shields.io/endpoint?url=https%3A%2F%2Fbowtie.report%2Fbadges%2Frust-jsonschema%2Fsupported_versions.json)

Fast JSON Schema validation for Python implemented in Rust.
A high-performance JSON Schema validator for Python.

Supported drafts:
```python
import jsonschema_rs

validator = jsonschema_rs.JSONSchema({"minimum": 42})

# Boolean result
validator.is_valid(45)

# Raise a ValidationError
validator.validate(41)
# ValidationError: 41 is less than the minimum of 42
#
# Failed validating "minimum" in schema
#
# On instance:
# 41

# Iterate over all validation errors
for error in validator.iter_errors(40):
    print(f"Error: {error}")
```

## Highlights

- 📚 Support for popular JSON Schema drafts
- 🌐 Remote reference fetching (network/file)
- 🔧 Custom format validators

### Supported drafts

Compliance levels vary across drafts, with newer versions having some unimplemented keywords.

- ![Draft 2020-12](https://img.shields.io/endpoint?url=https%3A%2F%2Fbowtie.report%2Fbadges%2Frust-jsonschema%2Fcompliance%2Fdraft2020-12.json)
- ![Draft 2019-09](https://img.shields.io/endpoint?url=https%3A%2F%2Fbowtie.report%2Fbadges%2Frust-jsonschema%2Fcompliance%2Fdraft2019-09.json)
- ![Draft 7](https://img.shields.io/endpoint?url=https%3A%2F%2Fbowtie.report%2Fbadges%2Frust-jsonschema%2Fcompliance%2Fdraft7.json)
- ![Draft 6](https://img.shields.io/endpoint?url=https%3A%2F%2Fbowtie.report%2Fbadges%2Frust-jsonschema%2Fcompliance%2Fdraft6.json)
- ![Draft 4](https://img.shields.io/endpoint?url=https%3A%2F%2Fbowtie.report%2Fbadges%2Frust-jsonschema%2Fcompliance%2Fdraft4.json)

- Draft 7
- Draft 6
- Draft 4
You can check the current status on the [Bowtie Report](https://bowtie.report/#/implementations/rust-jsonschema).

There are some notable restrictions at the moment:
## Limitations

- The underlying Rust crate doesn't support arbitrary precision integers yet, which may lead to `SystemError` when such value is used;
- Unicode surrogates are not supported;
- No support for arbitrary precision numbers

## Installation

@@ -28,36 +62,34 @@ pip install jsonschema-rs

## Usage

To check if the input document is valid:
If you have a schema as a JSON string, then you could use
`jsonschema_rs.JSONSchema.from_str` to avoid parsing on the
Python side:

```python
import jsonschema_rs

validator = jsonschema_rs.JSONSchema({"minimum": 42})
validator.is_valid(45) # True
validator = jsonschema_rs.JSONSchema.from_str('{"minimum": 42}')
...
```

or:
You can specify a custom JSON Schema draft using the `draft` argument:

```python
import jsonschema_rs

validator = jsonschema_rs.JSONSchema({"minimum": 42})
validator.validate(41) # raises ValidationError
validator = jsonschema_rs.JSONSchema(
    {"minimum": 42},
    draft=jsonschema_rs.Draft7
)
```

If you have a schema as a JSON string, then you could use
`jsonschema_rs.JSONSchema.from_str` to avoid parsing on the
Python side:
JSON Schema allows for format validation through the `format` keyword. While `jsonschema-rs`
provides built-in validators for standard formats, you can also define custom format validators
for domain-specific string formats.

```python
import jsonschema_rs
To implement a custom format validator:

validator = jsonschema_rs.JSONSchema.from_str('{"minimum": 42}')
...
```

You can define custom format checkers:
1. Define a function that takes a `str` and returns a `bool`.
2. Pass it with the `formats` argument.
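
The two steps above can be sketched as follows. This is a minimal illustration, not the project's own example: the `currency` format name, the three-uppercase-letter rule, and the `build_validator` helper are assumptions made for the sketch. Only `jsonschema_rs.JSONSchema` and its `formats` argument come from the text above, and the import is deferred so the check function can be tried without the package installed.

```python
def is_currency(value: str) -> bool:
    """Step 1: a format check that takes a `str` and returns a `bool`.

    Hypothetical rule: exactly three uppercase ASCII letters, e.g. "USD".
    """
    return len(value) == 3 and value.isascii() and value.isalpha() and value.isupper()


def build_validator():
    """Step 2: pass the check via the `formats` argument.

    Assumes `jsonschema-rs` is installed; the schema is illustrative.
    """
    import jsonschema_rs

    return jsonschema_rs.JSONSchema(
        {"type": "string", "format": "currency"},
        formats={"currency": is_currency},
    )


# The check function can be exercised on its own.
print(is_currency("USD"))
print(is_currency("invalid"))
```

Keeping the check a plain function makes it easy to unit test independently of the validator it is registered with.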

```python
import jsonschema_rs
@@ -77,80 +109,37 @@ validator.is_valid("invalid") # False

## Performance

According to our benchmarks, `jsonschema-rs` is usually faster than
existing alternatives in real-life scenarios.

However, for small schemas & inputs it might be slower than
`fastjsonschema` or `jsonschema` on PyPy.

### Input values and schemas
`jsonschema-rs` is designed for high performance, outperforming other Python JSON Schema validators in most scenarios:

- [Zuora](https://github.com/APIs-guru/openapi-directory/blob/master/APIs/zuora.com/2021-04-23/openapi.yaml) OpenAPI schema (`zuora.json`). Validated against [OpenAPI 3.0 JSON Schema](https://github.com/OAI/OpenAPI-Specification/blob/main/schemas/v3.0/schema.json) (`openapi.json`).
- [Kubernetes](https://raw.githubusercontent.com/APIs-guru/openapi-directory/master/APIs/kubernetes.io/v1.10.0/swagger.yaml) Swagger schema (`kubernetes.json`). Validated against [Swagger JSON Schema](https://github.com/OAI/OpenAPI-Specification/blob/main/schemas/v2.0/schema.json) (`swagger.json`).
- Canadian border in GeoJSON format (`canada.json`). Schema is taken from the [GeoJSON website](https://geojson.org/schema/FeatureCollection.json) (`geojson.json`).
- Concert data catalog (`citm_catalog.json`). Schema is inferred via [infers-jsonschema](https://github.com/Stranger6667/infers-jsonschema) & manually adjusted (`citm_catalog_schema.json`).
- `Fast` is taken from [fastjsonschema benchmarks](https://github.com/horejsek/python-fastjsonschema/blob/master/performance.py#L15) (`fast_schema.json`, `fast_valid.json` and `fast_invalid.json`).
- Up to **30-390x** faster than `jsonschema` for complex schemas and large instances
- Generally 2-5x faster than `fastjsonschema` on CPython
- Comparable or slightly slower performance for very small schemas

| Case | Schema size | Instance size |
| ---------------- | ------------- | --------------- |
| OpenAPI | 18 KB | 4.5 MB |
| Swagger | 25 KB | 3.0 MB |
| Canada | 4.8 KB | 2.1 MB |
| CITM catalog | 2.3 KB | 501 KB |
| Fast (valid) | 595 B | 55 B |
| Fast (invalid) | 595 B | 60 B |
For detailed benchmarks, see our [full performance comparison](BENCHMARKS.md).

Compiled validators (when the input schema is compiled once and reused
later). `jsonschema-rs` comes in three variants in the tables below:

- `validate`. This method raises `ValidationError` on errors or returns `None` on their absence.
- `is_valid`. A faster method that returns a boolean result whether the instance is valid.
- `overhead`. Only transforms data to the underlying Rust types and does not perform any validation. Shows the Python -> Rust data conversion cost.

Ratios are given against the `validate` variant.

Small schemas:

| library | `true` | `{"minimum": 10}` | `Fast (valid)` | `Fast (invalid)` |
|---------------------------|-----------------------|------------------------|------------------------|------------------------|
| jsonschema-rs\[validate\] | 93.84 ns | 94.83 ns | 1.2 us | 1.84 us |
| jsonschema-rs\[is_valid\] | 70.22 ns (**x0.74**) | 68.26 ns (**x0.71**) | 688.70 ns (**x0.57**) | 1.26 us (**x0.68**) |
| jsonschema-rs\[overhead\] | 65.27 ns (**x0.69**) | 66.90 ns (**x0.70**) | 461.53 ns (**x0.38**) | 925.16 ns (**x0.50**) |
| fastjsonschema\[CPython\] | 58.19 ns (**x0.62**) | 105.77 ns (**x1.11**) | 3.98 us (**x3.31**) | 4.57 us (**x2.48**) |
| fastjsonschema\[PyPy\] | 10.39 ns (**x0.11**) | 34.96 ns (**x0.36**) | 866 ns (**x0.72**) | 916 ns (**x0.49**) |
| jsonschema\[CPython\] | 235.06 ns (**x2.50**)| 1.86 us (**x19.6**) | 56.26 us (**x46.88**) | 59.39 us (**x32.27**) |
| jsonschema\[PyPy\] | 40.83 ns (**x0.43**) | 232.41 ns (**x2.45**) | 21.82 us (**x18.18**) | 22.23 us (**x12.08**) |

Large schemas:
## Python support

| library | `Zuora (OpenAPI)` | `Kubernetes (Swagger)` | `Canada (GeoJSON)` | `CITM catalog` |
|---------------------------|------------------------|------------------------|------------------------|------------------------|
| jsonschema-rs\[validate\] | 17.311 ms | 15.194 ms | 5.018 ms | 4.765 ms |
| jsonschema-rs\[is_valid\] | 16.605 ms (**x0.95**) | 12.610 ms (**x0.82**) | 4.954 ms (**x0.98**) | 2.792 ms (**x0.58**) |
| jsonschema-rs\[overhead\] | 12.017 ms (**x0.69**) | 8.005 ms (**x0.52**) | 3.702 ms (**x0.73**) | 2.303 ms (**x0.48**) |
| fastjsonschema\[CPython\] | -- (1) | 90.305 ms (**x5.94**) | 32.389 ms (**x6.45**) | 12.020 ms (**x2.52**) |
| fastjsonschema\[PyPy\] | -- (1) | 37.204 ms (**x2.44**) | 8.450 ms (**x1.68**) | 4.888 ms (**x1.02**) |
| jsonschema\[CPython\] | 764.172 ms (**x44.14**)| 1.063 s (**x69.96**) | 1.301 s (**x259.26**) | 115.362 ms (**x24.21**)|
| jsonschema\[PyPy\] | 604.557 ms (**x34.92**)| 619.744 ms (**x40.78**)| 524.275 ms (**x104.47**)| 25.275 ms (**x5.30**) |
`jsonschema-rs` supports CPython 3.8, 3.9, 3.10, 3.11, and 3.12.

Notes:
## Support

1. `fastjsonschema` fails to compile the Open API spec due to the presence of the `uri-reference` format (that is not defined in Draft 4). However, unknown formats are [explicitly supported](https://tools.ietf.org/html/draft-fge-json-schema-validation-00#section-7.1) by the spec.
If you have questions, need help, or want to suggest improvements, please use [GitHub Discussions](https://github.com/Stranger6667/jsonschema-rs/discussions).

The bigger the input, the bigger the performance win. You can take a look at the benchmarks in `benches/bench.py`.
## Sponsorship

Package versions:
If you find `jsonschema-rs` useful, please consider [sponsoring its development](https://github.com/sponsors/Stranger6667).

- `jsonschema-rs` - latest version from the repository
- `jsonschema` - `3.2.0`
- `fastjsonschema` - `2.15.1`
## Contributing

Measured with stable Rust 1.56, CPython 3.9.7 / PyPy3 7.3.6 on Intel i8700K
We welcome contributions! Here's how you can help:

## Python support
- Share your use cases
- Implement missing keywords
- Fix failing test cases from the [JSON Schema test suite](https://bowtie.report/#/implementations/rust-jsonschema)

`jsonschema-rs` supports CPython 3.8, 3.9, 3.10, 3.11, and 3.12.
See [CONTRIBUTING.md](../../CONTRIBUTING.md) for more details.

## License

The code in this project is licensed under [MIT license](https://opensource.org/licenses/MIT). By contributing to `jsonschema-rs`, you agree that your contributions will be licensed under its MIT license.
Licensed under [MIT License](LICENSE).

39 changes: 7 additions & 32 deletions crates/jsonschema-py/benches/bench.py
@@ -1,7 +1,6 @@
import json
import sys
from contextlib import suppress
from functools import partial

import fastjsonschema
import jsonschema
@@ -53,12 +52,6 @@ def load_from_benches(filename, loader=load_json):
    5,
]


@pytest.fixture(params=[True, False], ids=("compiled", "raw"))
def is_compiled(request):
    return request.param


if jsonschema_rs is not None:
    variants = [
        "jsonschema-rs-is-valid",
@@ -79,36 +72,18 @@ def variant(request):


@pytest.fixture
def args(request, variant, is_compiled):
def args(request, variant):
    schema, instance = request.node.get_closest_marker("data").args
    if (schema is OPENAPI or schema is SWAGGER) and variant == "fastjsonschema":
        pytest.skip("fastjsonschema does not support the uri-reference format and errors")
    if variant == "jsonschema-rs-is-valid":
        if is_compiled:
            return jsonschema_rs.JSONSchema(schema).is_valid, instance
        else:
            return (
                jsonschema_rs.is_valid,
                schema,
                instance,
            )
        return jsonschema_rs.JSONSchema(schema).is_valid, instance
    if variant == "jsonschema-rs-validate":
        if is_compiled:
            return jsonschema_rs.JSONSchema(schema).validate, instance
        else:
            return (
                jsonschema_rs.validate,
                schema,
                instance,
            )
        return jsonschema_rs.JSONSchema(schema).validate, instance
    if variant == "jsonschema":
        if is_compiled:
            return jsonschema.validators.validator_for(schema)(schema).is_valid, instance
        else:
            return jsonschema.validate, instance, schema
        return jsonschema.validators.validator_for(schema)(schema).is_valid, instance
    if variant == "fastjsonschema":
        if is_compiled:
            return fastjsonschema.compile(schema, use_default=False), instance
        else:
            return partial(fastjsonschema.validate, use_default=False), schema, instance
        return fastjsonschema.compile(schema, use_default=False), instance


@pytest.mark.parametrize(