Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Difficult to use for binary data in py3 #17

Open
parneshraniga opened this issue Sep 4, 2019 · 5 comments
Open

Difficult to use for binary data in py3 #17

parneshraniga opened this issue Sep 4, 2019 · 5 comments

Comments

@parneshraniga
Copy link
Contributor

I have used the library with py2 and now I am moving my projects to py3. The bindings are designed to work with text data which was not a big issue with py2 where bytes and str are analogous but this fails in py3 where str is unicode.

I have hacked the code for CMS to make it work for binary data under py3 (basically assume that bytes are passed in and returned) and this can be explicitly converted to str by the user is need be. Happy to have a closer look and provide a non-hacky patch.

@vbwagner
Copy link
Owner

vbwagner commented Sep 4, 2019

Really CMS stands for "cryptrographic MESSAGE syntax" and messages typically are human readable. So, I don't think that it is good idea to interpret them as binary data by default.

So, if you are going to submit me patch, please make sure that it is able to have both str and bytes transparently for user (or may be not so transparently, i.e. using keyword args).

@parneshraniga
Copy link
Contributor Author

Thanks @vbwagner. I am trying to implement some encryption for medical imaging files (ftp://medical.nema.org/medical/dicom/final/sup55_ft.pdf). Looking at the RFC 2630, it looks the byte arrays are inputs and outputs.

In any case, I will see if I can easily modify the code to preserve currently functionality and also get bytes out.

@vbwagner
Copy link
Owner

vbwagner commented Sep 4, 2019

Of course, you are right. Binary data ought to be supported by encryption/signature formats.
But I have to think a bit how to make interface more consistent and easy to use,

I've already solved similar problems in ctypescrypto.bio and ctypescrypto.x509 modules.
And it seems that support for binary and character data is not consistent across all ctypescrypto
suibmodules.
Now BIO accepts both character and binary data, cipher and digest - require binary, and CMS - text.
Cipher returns binary data and CMS decrypt - text.

Serialization of CMS into DER format via str fuction is obviosly error in Python3. It goes unnoticed only because there is no tests for this module. DER format should be binary.

@parneshraniga
Copy link
Contributor Author

@vbwagner. Thanks for the info. I have made some minor changes whereby decrypt return bytes if Binary flag is set and string otherwise. To support this, I have added a bytes function to CMSBase. However, it might be worth being more explicit with DER format similar to how the PEM function is explicit.

@manuel-arguelles
Copy link

manuel-arguelles commented Oct 18, 2019

I can confirm this, the __str__ method of CMSBase does not works in python3 as it does in python2, the error is:

File "/usr/local/lib/python3.6/dist-packages/ctypescrypto/cms.py", line 94, in __str__ return str(bio)
File "/usr/local/lib/python3.6/dist-packages/ctypescrypto/bio.py", line 57, in __unicode__ return self.__bytes__().decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x82 in position 1: invalid start byte

While on python2 works as expected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants