Inconsistent data truncation of unsupported data types in SDO upload #436

Open
sveinse opened this issue May 18, 2024 · 20 comments

Comments

@sveinse
Collaborator

sveinse commented May 18, 2024

I'm having problems reading SDO objects that are using UNSIGNED48. It used to work before PR #395 by @ljungholm - but it turns out to be coincidental.

var = self.od.get_variable(index, subindex)
if var is not None:
    # Found a matching variable in OD
    # If this is a data type (string, domain etc) the size is
    # unknown anyway so keep the data as is
    if var.data_type not in objectdictionary.DATA_TYPES:
        # Get the size in bytes for this variable
        var_size = len(var) // 8
        if response_size is None or var_size < response_size:
            # Truncate the data to specified size
            data = data[0:var_size]
I believe the root cause is that canopen doesn't support the UNSIGNED48 type (which I'll contribute in another PR). The unfortunate behavior is that due to the unknown type, the data gets truncated to 1 byte. It didn't do that before #395 .

>> upload((8192, 2), {})
<-- 602     :  40 00 20 02 00 00 00 00
--> 582     :  41 00 20 02 08 00 00 00
  << read_response() = b'A\x00 \x02\x08\x00\x00\x00'
<-- 602     :  60 00 00 00 00 00 00 00
--> 582     :  00 b2 01 20 02 91 12 00
  << read_response() = b'\x00\xb2\x01 \x02\x91\x12\x00'
<-- 602     :  70 00 00 00 00 00 00 00
--> 582     :  1d 00 00 00 00 00 00 00
  << read_response() = b'\x1d\x00\x00\x00\x00\x00\x00\x00'
     data=b'\xb2\x01 \x02\x91\x12\x00\x00'
     response_size = 8
     var.datatype = 25   # UNSIGNED48 (unsupported)
     var_size = 1   # Because len(var) on unknown datatypes is 8
     data = b'\xb2'   # Truncated due to ^
<< upload() = b'\xb2'

Observations:

  • Reading SDO objects without OD record will preserve data as-is
  • Reading SDO objects with OD, but of unsupported type will return data of size 1
  • ODVariable.__len__() returns 8 on unknown types, while it reports the number of bits for the other types. This is why the data gets truncated to one byte.

Would you agree that upload() should return the full data for both of the first use cases? Is the fix that ODVariable.__len__() on unknown datatypes should return 64?
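For illustration, the failure mode can be reproduced in isolation. This is a minimal sketch with a hypothetical FakeVariable standing in for ODVariable (the real class lives in canopen.objectdictionary); it only mimics the reported behavior that __len__() reports bits and falls back to 8 for unknown types:

```python
# Hypothetical stand-in mimicking the reported ODVariable behavior:
# __len__() reports the size in bits, falling back to 8 for unknown types.
class FakeVariable:
    BIT_LENGTHS = {0x07: 32}  # UNSIGNED32; UNSIGNED48 (0x19) is missing here

    def __init__(self, data_type):
        self.data_type = data_type

    def __len__(self):
        return self.BIT_LENGTHS.get(self.data_type, 8)  # fall-back: 8 bits

def truncate(var, data, response_size):
    # Mirrors the snippet above: cut the data down to the OD-declared size
    var_size = len(var) // 8
    if response_size is None or var_size < response_size:
        data = data[0:var_size]
    return data

payload = b"\xb2\x01\x20\x02\x91\x12\x00\x00"  # 8 bytes from the SDO upload
print(truncate(FakeVariable(0x19), payload, 8))  # UNSIGNED48 -> b'\xb2'
```

The fall-back of 8 bits turns into var_size = 1, so the 8-byte UNSIGNED48 payload collapses to a single byte, matching the trace above.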

@sveinse
Collaborator Author

sveinse commented May 20, 2024

Would you agree that upload() should return the full data for both of the first use cases? Is the fix that ODVariable.__len__() on unknown datatypes should return 64?

PR #440 implements four test cases (test_unknown_*()) that address this issue. The tests fail in the PR because the response contains longer data and it gets truncated. IMHO canopen shouldn't truncate any messages it encounters for unknown datatypes. Especially when the data is passed transparently if there is no OD entry for it.

Why do we have the if var.data_type not in objectdictionary.DATA_TYPES: test and then proceed to truncate the message?

@christiansandberg
Owner

christiansandberg commented May 20, 2024

Maybe using 64 bits instead of 8 is the easiest solution. Hopefully it shouldn't happen too often after 48 bits support is added.

@sveinse
Collaborator Author

sveinse commented May 20, 2024

It depends on not getting a future datatype which is larger than 64 bits. If we choose to cap it at 64, I would suggest logging a warning that the data has been capped due to an unknown datatype. (And the test in PR #440 must be altered to make it pass.)
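Such a capped fall-back with a logged warning could look roughly like this (a hypothetical sketch, not existing API; BIT_LENGTHS here stands in for the library's type-size table):

```python
import logging

logger = logging.getLogger(__name__)

BIT_LENGTHS = {0x05: 8, 0x06: 16, 0x07: 32}  # known types (excerpt)
FALLBACK_BITS = 64  # hypothetical cap for unknown data types

def od_bit_length(data_type):
    # Warn whenever we have to guess the size of an unknown data type
    if data_type not in BIT_LENGTHS:
        logger.warning(
            "Unknown data type 0x%02X, capping at %d bits",
            data_type, FALLBACK_BITS,
        )
        return FALLBACK_BITS
    return BIT_LENGTHS[data_type]
```

With this, a 48-bit payload on an unknown type would survive (64 bits > 48 bits), but any future type wider than 64 bits would silently hit the cap again, which is the concern raised above.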

@acolomb
Collaborator

acolomb commented May 23, 2024

I'm not really fond of replacing the fall-back value of 8 bits with 64 bits.

It makes sense for any type to have at least 8 bits, since one octet is the smallest unit supported by SDO (disregarding PDO mapping with finer granularity here). Any other fall-back value is arbitrary and will eventually break in unexpected ways.

We should rather fix the min(OD, SDO) length assumption to only get applied when the OD size is actually meaningful and not itself a fall-back value.
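In code, that could mean guarding the truncation on whether the OD length is actually meaningful. A sketch under the assumption that struct-decoded type codes are the only ones with a trustworthy length (STRUCT_TYPES and Var here are placeholders, not the library's actual classes):

```python
STRUCT_TYPES = {0x02, 0x03, 0x04, 0x05, 0x06, 0x07}  # fixed-size types (excerpt)

class Var:
    # Hypothetical minimal stand-in for ODVariable
    def __init__(self, data_type, bits):
        self.data_type = data_type
        self.bits = bits

    def __len__(self):
        return self.bits  # size in bits, as in the library

def maybe_truncate(var, data, response_size):
    # Only apply min(OD, SDO) when len(var) is not itself a fall-back value
    if var is None or var.data_type not in STRUCT_TYPES:
        return data  # size unknown or variable: pass through untouched
    var_size = len(var) // 8
    if response_size is None or var_size < response_size:
        data = data[0:var_size]
    return data
```

An UNSIGNED48 entry (type 0x19, not in the placeholder set) then passes through untruncated, while an UNSIGNED32 with excess payload still gets trimmed to 4 bytes.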

@sveinse
Collaborator Author

sveinse commented May 23, 2024

We should rather fix the min(OD, SDO) length assumption to only get applied when the OD size is actually meaningful and not itself a fall-back value.

I think this makes sense. Since the data type is unknown, no assumptions about the data contents can be made, so it should be passed transparently without truncation. -- Especially since it would pass transparently if the object has no OD entry.

@christiansandberg
Owner

I agree, better to check if size is known or not.

@sveinse
Collaborator Author

sveinse commented May 23, 2024

I attempted

var = self.od.get_variable(index, subindex)
if var is not None:
    if var.data_type not in objectdictionary.DATA_TYPES:
        # Use max to determine enough space for the unknown data type
        var_size = max(len(var) // 8, response_size)
        if response_size is None or var_size < response_size:
            # This will never happen...
            data = data[0:var_size]
return data

But it turned out a bit nonsensical. So I'm back to asking: why do we even truncate? Why would we want to shorten data whose type we don't know?

@acolomb
Collaborator

acolomb commented May 23, 2024

Just a quick note: the truncating was introduced to avoid an exception from the struct methods. They are quite strict about the buffer length, but can be satisfied by slicing the buffer to the expected size.

Will need to take another good look at the alternatives and code flow...
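For reference, Python's struct machinery is indeed strict about buffer length, and slicing to the expected size is what sidesteps the exception:

```python
import struct

fmt = struct.Struct("<L")  # little-endian UNSIGNED32, expects exactly 4 bytes

print(fmt.unpack(b"\x01\x02\x03\x04"))  # works: (67305985,)

try:
    fmt.unpack(b"\x01\x02\x03\x04\x05")  # 5 bytes: raises struct.error
except struct.error as exc:
    print(exc)

# Slicing the buffer down to fmt.size avoids the exception:
print(fmt.unpack(b"\x01\x02\x03\x04\x05"[:fmt.size]))  # (67305985,)
```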

@sveinse
Collaborator Author

sveinse commented May 23, 2024

I would assume that known data types can or should be truncated in case of excessive data. Here we're talking about unknown data types with no reference to validate its proper size.

@acolomb
Collaborator

acolomb commented May 27, 2024

I'm really still uncertain what the actual problem here is. SdoClient.upload() is actually independent of the ODVariable. It used to (before #395) only consult the OD when the uploaded data was of unspecified size. #395 changed that to apply even when an upload size was indicated, which I think is not always correct. For broken devices that send e.g. an UNSIGNED8 as UNSIGNED32 with four data bytes, it does make sense to truncate the data instead of failing during unpack(). But for any type not based on struct.Struct, truncating is usually harmful.

The real culprit here is IMHO the check var.data_type not in objectdictionary.DATA_TYPES. That applies for any unknown OD entry type as well, but should actually be limited to types where the length is known in advance. So better to check for var.data_type in var.STRUCT_TYPES here, assuming that all known-length data types will be unpacked using structs. Even better to encapsulate this into a new property ODVariable.length_known.

And maybe this whole check should be moved to SdoVariable.get_data(), where we are certain that an associated ODVariable is actually available? For the "OD-independent" (as stated by the docstring) .upload() method, consulting the OD is a bit inconsistent.
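Such a property could be sketched like this (hypothetical minimal excerpt; the real ODVariable carries much more state, and STRUCT_TYPES is assumed to list the struct-decoded type codes):

```python
class ODVariable:
    # Hypothetical excerpt: type codes decoded via struct (fixed length).
    # 0x02..0x07 are INTEGER8..UNSIGNED32, 0x08 is REAL32.
    STRUCT_TYPES = {0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08}

    def __init__(self, data_type):
        self.data_type = data_type

    @property
    def length_known(self):
        # True only for fixed-size types parsed by a struct; strings,
        # DOMAIN and undefined type codes all report False.
        return self.data_type in self.STRUCT_TYPES
```

The truncation site would then read `if var is not None and var.length_known:` instead of the inverted DATA_TYPES test, so undefined type codes automatically fall on the safe, untruncated side.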

@sveinse
Collaborator Author

sveinse commented May 27, 2024

I think we have two independent factors or conditions for any object read by SDO upload:

  • Is the var using a known data type
  • Is there an OD entry for the object

(The PR #440 contains unit tests for both of these cases)

Known data type | Has OD entry | Result data
Yes             | Yes          | As-is
Yes             | No           | As-is
No              | No           | As-is
No              | Yes          | Data truncated

So I think the summary of the problem statement is that the current implementation truncates the data in the latter case.

@acolomb
Collaborator

acolomb commented May 27, 2024

I think the problem is to more clearly define what is a "known" data type. The check against objectdictionary.DATA_TYPES actually just matches (VISIBLE_STRING, OCTET_STRING, UNICODE_STRING, DOMAIN). Those are the defined ones which are not parsed by a struct, and the check also matches undefined data type entries. We should invert that and apply the truncation only for known data types where a struct gets used for parsing, I think.

Also, your second case is impossible, as the data type is defined in the OD entry, thus without one we cannot know.

@sveinse
Collaborator Author

sveinse commented May 27, 2024

I think the problem is to more clearly define what is a "known" data type. The check against objectdictionary.DATA_TYPES actually just matches (VISIBLE_STRING, OCTET_STRING, UNICODE_STRING, DOMAIN). Those are the defined ones which are not parsed by a struct, and the check also matches undefined data type entries. We should invert that and apply the truncation only for known data types where a struct gets used for parsing, I think.

Yes, I think that is a good idea. We must exempt the array types from this truncation, namely (VISIBLE_STRING, OCTET_STRING, UNICODE_STRING, DOMAIN).

Also, your second case is impossible, as the data type is defined in the OD entry, thus without one we cannot know.

Yes, you're right. Without the OD one cannot know the data type; I missed that fact. It is still possible, though, to read an object with SDO that is missing from the OD and thus has no known data type. Its data should definitely not be truncated.

@acolomb
Collaborator

acolomb commented May 27, 2024

Right, so to summarize: We only truncate if the OD contains an entry AND its data type is contained in STRUCT_TYPES. That automatically leaves array types (DOMAIN etc.) untouched, and will not explode for yet undefined type constants.

On to the other questions:

  • Should the truncation always happen in SdoClient.upload(), or do we move it to SdoVariable.get_data(), which will only ever be called with a valid ODVariable associated? IMHO if upload() states that it works "without an Object Dictionary", then looking up anything from the OD in that function will be a surprise.
  • I suggested encapsulating the check into some ODVariable.length_known property for better readability and separation of concerns.

@sveinse
Collaborator Author

sveinse commented May 27, 2024

It must look up the OD to get the data type, so in order for SdoClient.upload() to do any length validation, it must use the OD.

IMHO SdoClient.upload() is just a raw producer of bytes using an SDO upload request. The deserialization step into native Python objects happens later. So I think I lean towards not doing any validation in SdoClient.upload().

Today the deserialization happens in ODVariable.decode_raw() with its call from the Variable.raw property. This is where the stream from the SDO upload should be validated, I think.

EDIT I vote for removing any validation from SdoClient.upload() and add it to ODVariable.decode_raw(). The validation should only be made on known data types (that is types this library can decode) with the exception of the known array formats.

PPS! Should we validate, e.g. by warning or error, or should we truncate (and warn)?

@acolomb
Collaborator

acolomb commented Jun 3, 2024

The only reason for doing validation in .upload() is that we know the indicated data set size there. The check prior to #395 actually only kicked in when that size was None, i.e. the server didn't specify how many data bytes should be interpreted.

When moving the validation to decode_raw() completely, we lose that bit of information. But frankly, I'm not sure we really need it at all. Trying to summarize it into a table again (may need to scroll horizontally):

No. | Type in OD                     | SDO size indication (response_size) | SDO payload (len(data)) | Action in decode_raw()      | Use case
1   | in var.STRUCT_TYPES            | unspecified                         | == len(var)             | OK, unpack                  | normal
2   | in var.STRUCT_TYPES            | unspecified                         | < len(var)              | pad zeros right, unpack     | partial (compressed?) response
3   | in var.STRUCT_TYPES            | unspecified                         | > len(var)              | truncate right, unpack      | normal, depends on OD
4   | in var.STRUCT_TYPES            | == len(var)                         | (don't care)            | OK, unpack                  | normal
5   | in var.STRUCT_TYPES            | < len(var)                          | (don't care)            | pad zeros right, unpack     | partial (compressed?) response
6   | in var.STRUCT_TYPES            | > len(var)                          | (don't care)            | truncate right, unpack      | quirk, always 4-byte for small vars
7   | in objectdictionary.DATA_TYPES | unspecified                         | use as length           | bytes(data)                 | normal, streaming until complete
8   | in objectdictionary.DATA_TYPES | specified                           | (don't care)            | bytes(data[:response_size]) | normal, discarding extra (unused) bytes
9   | neither STRUCT nor DATA        | unspecified                         | use as length           | bytes(data)                 | unknown data type
10  | neither STRUCT nor DATA        | specified                           | (don't care)            | bytes(data[:response_size]) | unknown data type, discarding extra bytes
11  | no OD entry                    | unspecified                         | use as length           | bytes(data)                 | unknown data type
12  | no OD entry                    | specified                           | (don't care)            | bytes(data[:response_size]) | unknown data type, discarding extra bytes

Notes:

  • (1) to (3) and (4) to (6) are equivalent here, which is why we don't really need to distinguish whether response_size was specified while decoding.
  • (2) and (5) are mostly useful in a PDO context, where a numeric variable can be mapped partially (e.g. an UNSIGNED32 object with small values mapping only 16 bits to PDO). We might want to differentiate this between PDO and SDO then?
  • (6) is essentially what #395 (Use smaller of response size and size in OD) tried to fix. Again, by handling the quirky responses in this way we make it equivalent to (3).
  • (7) to (12) all have in common that they are not in var.STRUCT_TYPES (also if var is None), so this is the only check we would really need. The not in objectdictionary.DATA_TYPES check (as currently coded) however doesn't catch all cases, especially unknown bit widths, which is why new tests fail without #440 (Implement the remaining canopen datatypes).

Does that all seem correct? Would it be okay for you if we used this table as basis for the implementation and tested each case?
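Condensed into code, the twelve rows collapse to a single type check. A sketch under the assumptions stated above (len(var) in bits; STRUCT_TYPES and Var are placeholders for the library's actual type mapping and ODVariable):

```python
STRUCT_TYPES = {0x02, 0x03, 0x04, 0x05, 0x06, 0x07}  # fixed-size types (excerpt)

class Var:
    # Hypothetical minimal stand-in for ODVariable
    def __init__(self, data_type, bits):
        self.data_type = data_type
        self.bits = bits

    def __len__(self):
        return self.bits

def normalize(var, data):
    """Rows 1-12 condensed: pad/truncate only for struct-decoded types."""
    if var is None or var.data_type not in STRUCT_TYPES:
        # Rows 7-12: strings, DOMAIN, unknown type codes or no OD entry:
        # hand the payload over untouched.
        return bytes(data)
    size = len(var) // 8
    if len(data) < size:
        return bytes(data).ljust(size, b"\x00")  # rows 2/5: pad zeros right
    return bytes(data[:size])                    # rows 1/3/4/6: fit to size
```

Note this deliberately ignores response_size, reflecting the observation that rows (1)-(3) and (4)-(6) act identically once decoding is reached.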

@sveinse
Collaborator Author

sveinse commented Jun 4, 2024

This is a really good overview. There are two independent usages of length involved here

  • Declared SDO upload size (response_size) vs. actual data (len(data)). If these don't match, I am inclined to consider this a protocol violation. (I wonder whether the SDO abort code 0x05040002 is intended for such size mismatches in block mode.)

  • Actual data size (len(data)) vs. expected OD size (len(var)). If there is too little data to decode the format, then, well, there is not enough information and this should be an exception. If there is excessive data, it can be truncated. But only if the data type is known. For anything else, the data should be passed transparently through.
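The two checks above could be sketched separately like this (hypothetical names throughout; SdoProtocolError is not an existing exception in the library):

```python
STRUCT_TYPES = {0x05, 0x06, 0x07}  # fixed-size types (excerpt)

class Var:
    # Hypothetical minimal stand-in for ODVariable
    def __init__(self, data_type, bits):
        self.data_type = data_type
        self.bits = bits

    def __len__(self):
        return self.bits

class SdoProtocolError(Exception):
    """Hypothetical: declared upload size disagrees with the payload."""

def check_sizes(data, response_size, var=None):
    # Check 1: declared SDO size vs. actual payload -> protocol violation
    if response_size is not None and len(data) != response_size:
        raise SdoProtocolError(
            f"server declared {response_size} bytes but sent {len(data)}"
        )
    # Check 2: payload vs. OD size, only for known fixed-size types
    if var is not None and var.data_type in STRUCT_TYPES:
        size = len(var) // 8
        if len(data) < size:
            raise ValueError("not enough data to decode the variable")
        return data[:size]  # excess data may be truncated
    return data  # unknown type: pass through transparently
```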

@acolomb
Collaborator

acolomb commented Jun 10, 2024

I'd really like to get this one fixed soon, but right now we only have ideas and not a complete and well-tested implementation draft of these last suggestions. Therefore I think we should focus on getting a release out the door to avoid problems like #455 and come back to this afterwards.

OK for you @sveinse?

acolomb added a commit to sveinse/canopen-asyncio that referenced this issue Jun 11, 2024
@sveinse
Collaborator Author

sveinse commented Jun 11, 2024

@acolomb Yes, let's not delay the release because of this. Current behavior is what it is and we're not introducing any regressions by not fixing this issue.

acolomb added a commit that referenced this issue Jun 11, 2024
* Implement the remaining canopen datatypes
* Add datatype defs for the canopen standard types
* Move datatypes_24.py classes to datatypes.py
* Replace Unsigned24 and Integer24 by generic UnsignedN and IntegerN respectively
* Add EDS-file containing all datatypes
* Added tests for encoding and decoding all datatypes
* Added tests for SDO uploads of all datatypes
* Annotate type hint for STRUCT_TYPES listing.
* Disable failing tests waiting on a fix for #436.

Co-authored-by: André Colomb <[email protected]>
@sveinse
Collaborator Author

sveinse commented Jun 16, 2024

@friederschueler I'm not sure I'd label this one as an enhancement. It's describing unexpected behavior, so it's a bug.
