Skip to content

Commit

Permalink
Greatly reduce amount of generated code (#1)
Browse files Browse the repository at this point in the history
### Changes
* A fuzz test ensures* that the fork and the original `serde_cbor` are compatible at the lower level. At the higher level (structs/enums/..), compatibility is ensured by passing all the same tests and also not touching the code.
* Add traits and two different implementations for configuring the encoder/encoder. The standard (static) configuration produces very little code as it avoids generating multiple branches in the encoder/decoder for each runtime configuration.
* Generate only the strictly required decoder code for each deserialized data type. So if a u64 is to be deserialized, only generate code to deserialize integers and fail for other core entities. This is done via the type system with traits and associated constants. Note that the decoder is the largest source of code bloat.
* Tidy up the code generation by reusing code from `ucbor`, which is highly optimized for speed and size.
* All non-configurable behavior was retained to allow fuzzing the new implementation against the original `serde_cbor`. This includes the nuances of modeling structs/newtypes/enums atop of the serde data model.
* Bugs were also retained. Specifically, disallowing packed encoding will also disallow any maps with integer keys, which isn't strictly correct. Ditto doesn't use packed encoding, so this doesn't affect our usage. I left it as it reduces the number of changes needed to make tests and fuzzer compatible.

### Breaking changes
* There's a second generic type argument on the (De)serializer types.
* The standard (non-configured) decoder disallows legacy enums by default.
* The error messages are slightly different

### Theoretical breaking change
* Breaks Deserialize implementations that tell the deserializer they want something but accept something completely different. All reasonable compatibility {str/bytes/seq}, {unsigned/signed/float} compatibilities were implemented to give an extra margin of safety. 
* Example: if the deserialize implementation calls `deserialize_str(visitor)` but would produce a value if the visitor `visit_u64(..)` was called. That is incompatible with this fork.
  • Loading branch information
arthurprs authored Sep 6, 2024
1 parent 347a3f0 commit e8ce605
Show file tree
Hide file tree
Showing 10 changed files with 1,143 additions and 654 deletions.
65 changes: 65 additions & 0 deletions .github/workflows/ci.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
name: Continuous integration

on:
push:
branches: [master]
pull_request:

jobs:
test:
name: Tests
runs-on: ubuntu-latest
steps:
- uses: styfle/[email protected]
with:
access_token: ${{ github.token }}
- uses: actions/checkout@v2
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: stable
override: true
- run: cargo test
- run: cargo test --no-default-features --tests
- run: cargo test --all-features

fuzz-tests:
name: Fuzz tests
runs-on: ubuntu-latest
steps:
- uses: styfle/[email protected]
with:
access_token: ${{ github.token }}
- uses: actions/checkout@v2
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: nightly
override: true
- run: cargo install cargo-fuzz
- run: for fuzz_test in `cargo fuzz list`; do cargo fuzz run $fuzz_test -- -max_total_time=60 -detect_leaks=0 -len_control=0 || exit 1; done

lint:
name: Rustfmt & Clippy
runs-on: ubuntu-latest
steps:
- uses: styfle/[email protected]
with:
access_token: ${{ github.token }}
- uses: actions/checkout@v2
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: stable
override: true
- run: rustup component add rustfmt
- uses: actions-rs/cargo@v1
with:
command: fmt
args: --all -- --check
# This is a legacy codebase and clippy has a lot of complaints, disable it for now
# - run: rustup component add clippy
# - uses: actions-rs/cargo@v1
# with:
# command: clippy
# args: -- -D warnings
5 changes: 5 additions & 0 deletions fuzz/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,11 @@ cargo-fuzz = true

[dependencies.serde_cbor]
path = ".."

[dependencies.serde_cbor_original]
git = "https://github.com/pyfisch/cbor.git"
package = "serde_cbor"

[dependencies.libfuzzer-sys]
git = "https://github.com/rust-fuzz/libfuzzer-sys.git"

Expand Down
7 changes: 5 additions & 2 deletions fuzz/fuzz_targets/from_reader.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,9 @@ extern crate serde_cbor;
use serde_cbor::Value;

fuzz_target!(|data: &[u8]| {
let mut data = data;
let _ = serde_cbor::from_reader::<Value, _>(&mut data);
let mut data1 = data;
let mut data2 = data;
let new = serde_cbor::from_reader::<Value, _>(&mut data1).map_err(|_| ());
let original = serde_cbor_original::from_reader::<Value, _>(&mut data2).map_err(|_| ());
assert_eq!(new, original);
});
8 changes: 7 additions & 1 deletion fuzz/fuzz_targets/from_slice.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,5 +5,11 @@ extern crate serde_cbor;
use serde_cbor::Value;

fuzz_target!(|data: &[u8]| {
let _ = serde_cbor::from_slice::<Value>(data);
let new = serde_cbor::from_slice::<Value>(data).map_err(|_| ());
let original = serde_cbor_original::from_slice::<Value>(data).map_err(|_| ());
assert_eq!(new, original);
assert_eq!(
new.and_then(|v| serde_cbor::to_vec(&v).map_err(|_| ())),
original.and_then(|v| serde_cbor_original::to_vec(&v).map_err(|_| ())),
);
});
Loading

0 comments on commit e8ce605

Please sign in to comment.