TypeError: expected string or buffer when .doc is converted to .docx with MS Office in Windows #219

Open
rejuashes opened this issue Jun 21, 2016 · 2 comments

Comments

@rejuashes

rejuashes commented Jun 21, 2016

pydocx_html_windows_error.txt
Hi Guys,

pydocx.to_html behaves differently on the same .doc file depending on how it was converted to .docx.

Scenario 1: the .doc file is converted to .docx using LibreOffice on Linux (saved as Microsoft Word 2007/2010/2013 XML). This works fine.

Scenario 2: the .doc file is converted to .docx using MS Office on Windows. This throws an error.

    return re.match('^\s*([^\s]+)\s*(.*)$', self.instr)
  File "/usr/lib/python2.7/re.py", line 137, in match
    return _compile(pattern, flags).match(string)
TypeError: expected string or buffer
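
For anyone reading the traceback: `re.match` raises this TypeError when its second argument is not a string, so `self.instr` was probably `None` (or some other non-string value) for a field in the Word-converted file. The sketch below reproduces that failure and shows one possible guard. The helper name `parse_instr` is hypothetical and not part of pydocx, and the pattern is my reading of the one in the traceback.

```python
import re

# Pattern as it appears in the traceback (an assumption about the pydocx source).
FIELD_PATTERN = re.compile(r'^\s*([^\s]+)\s*(.*)$')


def parse_instr(instr):
    """Hypothetical helper: split a field instruction into (type, args).

    Returns None instead of raising when the instruction is missing
    or does not match the pattern.
    """
    if instr is None:
        # Passing None to re.match is what raises the TypeError.
        return None
    match = FIELD_PATTERN.match(instr)
    if match is None:
        return None
    return match.group(1), match.group(2)


def reproduces_type_error():
    """Return True if matching against None raises TypeError."""
    try:
        FIELD_PATTERN.match(None)
    except TypeError:
        return True
    return False
```

Whether returning `None` is the right behaviour for pydocx is a separate question. It only shows where a check would keep the conversion from crashing.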

Any pointers would be helpful.

regards,

Rajith

@kylegibson
Contributor

Hi,

Thanks for the issue report! Could you attach the .docx file produced by MS Office on Windows that triggers the error?

Thanks,

-Kyle

@rejuashes
Author

Hi Kyle,

Attaching the original source .doc file which was converted to .docx.

regards,

rajith

ABC.zip

@winhamwr changed the title from "pydocx docx to html conversion error" to "TypeError: expected string or buffer when .doc is converted to .docx with MS Office in Windows" on Jul 29, 2016