Constructor throws "File is not a zip file" on file created using Word #1452
Comments
I'm not seeing any I don't know what the type of
@scanny, the exception is thrown in the constructor, so I never get the chance to save the file. Just curious: if I'm only parsing, why would I call save()? Files created using python-docx work; only those created in Microsoft Word fail. I got the stream using this function, which is based on Microsoft's suggestion here. And lastly, the trace. I hope these make things clear. Thanks for looking into this.
It looks like the problem is getting from Azure to
They do open after saving to Azure. I'm ruling out BytesIO as the culprit because docx files created using python-docx just work. Thank you for your time, @scanny.
My colleague figured this out. It was the cp037 encoding used by Word.
Hi. I'm trying to parse a docx using the BytesIO overload of the Document constructor. When I parse any docx created using python-docx, the code snippet below works just fine. However, with those saved from Word, I get the "File is not a zip file" error. I notice that the Word-created files are indeed larger than those created using python-docx, which suggests to me that they are not zip files. Is there a way to parse a non-zipped stream in this case? I'm on Python 3.11.
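For anyone hitting the same error: a .docx saved by Word is still a zip package, so a larger file size alone does not mean the stream is unzipped. One way to narrow things down is to inspect the raw bytes before wrapping them in BytesIO. The sketch below is not the poster's code and uses only the standard library. The helper name `describe_payload` and its return strings are made up for illustration, and the checks assume a normal package that begins with the `PK\x03\x04` local-file-header signature.

```python
import io
import zipfile

# Signature that starts an ordinary zip archive (and therefore a normal .docx).
ZIP_MAGIC = b"PK\x03\x04"


def describe_payload(data: bytes) -> str:
    """Give a rough diagnosis of bytes that are about to be wrapped in BytesIO.

    Hypothetical helper for debugging only; the labels are illustrative.
    """
    if not data:
        return "empty"
    if not data.startswith(ZIP_MAGIC):
        # Typical when the bytes were decoded/re-encoded as text on the way in.
        return "no zip signature"
    if not zipfile.is_zipfile(io.BytesIO(data)):
        # Header is present but the end-of-archive record is missing or damaged.
        return "truncated or corrupted zip"
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()
    if "[Content_Types].xml" not in names:
        return "zip, but not an OPC package"
    return "looks like a docx package"


def make_zip(member_name: str) -> bytes:
    """Build a tiny in-memory zip with one member, for demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(member_name, b"<Types/>")
    return buf.getvalue()


good = make_zip("[Content_Types].xml")
print(describe_payload(good))
print(describe_payload(good[: len(good) // 2]))
print(describe_payload(b"plain text, not an archive"))
```

If the downloaded bytes report anything other than a docx package, the problem is upstream of `Document(...)`, in how the stream was fetched or decoded, rather than in python-docx itself.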