zcbor.py: Performance improvements in DataTranslator #479

Draft: wants to merge 2 commits into base: main
16 changes: 13 additions & 3 deletions ARCHITECTURE.md
@@ -16,6 +16,7 @@ The functionality is spread across 5 classes:
1. CddlParser
2. CddlXcoder (inherits from CddlParser)
3. DataTranslator (inherits from CddlXcoder)
4. DataDecoder (inherits from DataTranslator)
4. CodeGenerator (inherits from CddlXcoder)
5. CodeRenderer

@@ -100,7 +101,7 @@ Most of the functionality falls into one of two categories:
- is_unambiguous(): Whether the type is completely specified, i.e. whether we know beforehand exactly how the encoding will look (e.g. `Foo = 5`).

DataTranslator
-----------
--------------

DataTranslator is for handling and manipulating CBOR on the "host".
For example, the user can compose data in YAML or JSON files and have them converted to CBOR and validated against the CDDL.
@@ -127,15 +128,23 @@ One caveat is that CBOR supports more features than YAML/JSON, namely:

zcbor allows creating bespoke representations via `--yaml-compatibility`, see the README or CLI docs for more info.

Finally, DataTranslator can also generate a separate internal representation using `namedtuple`s to allow browsing CBOR data by the names given in the CDDL.
DataTranslator functionality is tested in [tests/scripts/test_zcbor.py](tests/scripts/test_zcbor.py)

DataDecoder
-----------

DataDecoder contains functions for generating a separate internal representation using `namedtuple`s to allow browsing CBOR data by the names given in the CDDL.
(This is more analogous to how the data is accessed in the C code.)

DataTranslator functionality is tested in [tests/scripts](tests/scripts)
This functionality was originally part of DataTranslator, but was moved because the internal representation was always created but seldom used, and the namedtuples caused a noticeable performance hit.

DataDecoder functionality is tested in [tests/scripts/test_zcbor.py](tests/scripts/test_zcbor.py)
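
The cost of building namedtuples for every element can be illustrated with a small standalone benchmark. This is only a sketch under stated assumptions: the `Element` type and the helper functions are hypothetical stand-ins, not zcbor code, and the actual overhead in zcbor depends on how the namedtuple types are created and used.

```python
import timeit
from collections import namedtuple

# Hypothetical record type standing in for the per-element namedtuples
# a decoder might build. This is not taken from zcbor itself.
Element = namedtuple("Element", ["key", "value"])


def build_plain(n):
    """Build n plain tuples."""
    return [(i, i * 2) for i in range(n)]


def build_named(n):
    """Build n namedtuple instances holding the same data."""
    return [Element(i, i * 2) for i in range(n)]


def compare(n=1000, repeat=200):
    """Return (plain_seconds, named_seconds) for building n records `repeat` times."""
    plain = timeit.timeit(lambda: build_plain(n), number=repeat)
    named = timeit.timeit(lambda: build_named(n), number=repeat)
    return plain, named


if __name__ == "__main__":
    plain, named = compare()
    print(f"plain tuples: {plain:.4f} s")
    print(f"namedtuples:  {named:.4f} s")
```

Both builders return equal data, so any timing difference comes from object construction alone. Absolute numbers vary by machine and Python version.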

CodeGenerator
-------------

CodeGenerator, like DataTranslator, inherits from CddlXcoder.
It is used to generate C code.
Its primary purpose is to construct the individual decoding/encoding functions for the types specified in the given CDDL document.
It also constructs struct definitions used to hold the decoded data/data to be encoded.

@@ -158,6 +167,7 @@ repeated_foo() concerns itself with the individual value, while foo() concerns i

When invoking CodeGenerator, the user must decide which types it will need direct access to decode/encode.
These types are called "entry types" and they are typically the "outermost" types, or the types it is expected that the data will have.
CodeGenerator will generate a public function for each entry type.

The user can also use entry types when there are `"BSTR"`s that are CBOR encoded, specified as `Foo = bstr .cbor Bar`.
Usually such strings are automatically decoded/encoded by the generated code, and the objects part of the encompassing struct.
9 changes: 9 additions & 0 deletions MIGRATION_GUIDE.md
@@ -1,5 +1,14 @@
# zcbor v. 0.9.99

* The following `DataTranslator` functions have been moved to a separate class `DataDecoder`:

* `decode_obj()`
* `decode_str_yaml()`
* `decode_str()`

The split was done for performance reasons (namedtuple objects are slow to create).
This functionality is only relevant when zcbor is imported, so all CLI usage is unaffected.
The `DataDecoder` class is a subclass of `DataTranslator`, so it can also do all the same things as `DataTranslator`, but a bit slower.

# zcbor v. 0.9.0

2 changes: 1 addition & 1 deletion __init__.py
@@ -7,4 +7,4 @@

from pathlib import Path

from .zcbor.zcbor import CddlValidationError, DataTranslator, main
from .zcbor.zcbor import CddlValidationError, DataTranslator, DataDecoder, main
32 changes: 32 additions & 0 deletions tests/scripts/test_performance.py
@@ -0,0 +1,32 @@
import zcbor
import cbor2
import cProfile, pstats


try:
    import zcbor
except ImportError:
    print(
        """
The zcbor package must be installed to run these tests.
During development, install with `pip3 install -e .` to install in a way
that picks up changes in the files without having to reinstall.
"""
    )
    exit(1)

cddl_contents = """
Foo = int/bool
Bar = [0*1000(Foo)]
"""
message = list(range(500)) + list(bool(i % 2) for i in range(500))
raw_message = cbor2.dumps(message)
cmd_spec = zcbor.DataTranslator.from_cddl(cddl_contents, 3).my_types["Bar"]
# cmd_spec = zcbor.DataDecoder.from_cddl(cddl_contents, 3).my_types["Bar"]

profiler = cProfile.Profile()
profiler.enable()
json_obj = cmd_spec.str_to_json(raw_message)
profiler.disable()

profiler.print_stats()
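
The script above imports `pstats` but prints unsorted stats. A minimal sketch of how the same profiler output could be sorted and truncated with the standard library is shown here. The `workload` function and the `profile_top` helper are hypothetical stand-ins for illustration. In the script, the measured call would be `cmd_spec.str_to_json(raw_message)`.

```python
import cProfile
import io
import pstats


def workload():
    """Stand-in workload; replace with the call being measured."""
    return sum(i * i for i in range(10000))


def profile_top(func, limit=10):
    """Profile func() and return (result, report).

    The report lists at most `limit` entries, sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    result = func()
    profiler.disable()

    # Capture the report in a string instead of writing to stdout.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(limit)
    return result, stream.getvalue()


if __name__ == "__main__":
    _, report = profile_top(workload)
    print(report)
```

Sorting by cumulative time tends to surface the high-level functions that dominate the run, which makes it easier to compare a `DataTranslator` run against a `DataDecoder` run.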
14 changes: 7 additions & 7 deletions tests/scripts/test_zcbor.py
@@ -70,7 +70,7 @@ def decode_file(self, data_path, *cddl_paths):

def decode_string(self, data_string, *cddl_paths):
cddl_str = " ".join((Path(p).read_text(encoding="utf-8") for p in cddl_paths))
self.my_types = zcbor.DataTranslator.from_cddl(cddl_str, 16).my_types
self.my_types = zcbor.DataDecoder.from_cddl(cddl_str, 16).my_types
cddl = self.my_types["SUIT_Envelope_Tagged"]
self.decoded = cddl.decode_str(data_string)

@@ -1123,7 +1123,7 @@ def test_file_header(self):
class TestOptional(TestCase):
def test_optional_0(self):
with open(p_optional, "r", encoding="utf-8") as f:
cddl_res = zcbor.DataTranslator.from_cddl(f.read(), 16)
cddl_res = zcbor.DataDecoder.from_cddl(f.read(), 16)
cddl = cddl_res.my_types["cfg"]
test_yaml = """
mem_config:
@@ -1136,7 +1136,7 @@ def test_optional_0(self):

class TestUndefined(TestCase):
def test_undefined_0(self):
cddl_res = zcbor.DataTranslator.from_cddl(
cddl_res = zcbor.DataDecoder.from_cddl(
p_prelude.read_text(encoding="utf-8")
+ "\n"
+ p_corner_cases.read_text(encoding="utf-8"),
@@ -1154,7 +1154,7 @@ def test_undefined_0(self):

class TestFloat(TestCase):
def test_float_0(self):
cddl_res = zcbor.DataTranslator.from_cddl(
cddl_res = zcbor.DataDecoder.from_cddl(
p_prelude.read_text(encoding="utf-8")
+ "\n"
+ p_corner_cases.read_text(encoding="utf-8"),
@@ -1243,7 +1243,7 @@ def test_yaml_compatibility(self):

class TestIntmax(TestCase):
def test_intmax1(self):
cddl_res = zcbor.DataTranslator.from_cddl(
cddl_res = zcbor.DataDecoder.from_cddl(
p_prelude.read_text(encoding="utf-8")
+ "\n"
+ p_corner_cases.read_text(encoding="utf-8"),
@@ -1254,7 +1254,7 @@ def test_intmax1(self):
decoded = cddl.decode_str_yaml(test_yaml)

def test_intmax2(self):
cddl_res = zcbor.DataTranslator.from_cddl(
cddl_res = zcbor.DataDecoder.from_cddl(
p_prelude.read_text(encoding="utf-8")
+ "\n"
+ p_corner_cases.read_text(encoding="utf-8"),
@@ -1286,7 +1286,7 @@ def test_intmax2(self):

class TestInvalidIdentifiers(TestCase):
def test_invalid_identifiers0(self):
cddl_res = zcbor.DataTranslator.from_cddl(
cddl_res = zcbor.DataDecoder.from_cddl(
p_prelude.read_text(encoding="utf-8")
+ "\n"
+ p_corner_cases.read_text(encoding="utf-8"),