pdftoxml in utils.py is not portable to Windows. #50

Open
StevenMaude opened this issue May 16, 2014 · 3 comments

@StevenMaude
Contributor

  1. The /dev/null redirect needs to be NUL on Windows.
  2. NamedTemporaryFile behaves differently on Windows than on Unix.
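
One possible way to sidestep both points is sketched below. It is not the package's code: the helper names (run_quietly, write_closed_file, with_scratch_pdf) are hypothetical. It assumes Python 3, where subprocess.DEVNULL can replace a shell redirect to /dev/null or NUL. It also writes the input to an ordinary, already-closed file inside a TemporaryDirectory, because on Windows a NamedTemporaryFile generally cannot be reopened by name while it is still open.

```python
import os
import subprocess
import tempfile


def run_quietly(args):
    """Run a command and discard its output on any platform."""
    # subprocess.DEVNULL stands in for a shell redirect to /dev/null or NUL.
    return subprocess.call(
        args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )


def write_closed_file(data, directory, name="input.pdf"):
    """Write bytes to a file and close it so another process can open it."""
    path = os.path.join(directory, name)
    with open(path, "wb") as handle:
        handle.write(data)
    return path


def with_scratch_pdf(pdfdata, action):
    """Call action(path) on a scratch copy of pdfdata, then clean up."""
    # TemporaryDirectory removes the directory and its contents on exit.
    with tempfile.TemporaryDirectory() as workdir:
        return action(write_closed_file(pdfdata, workdir))
```

If a path to the null device is needed as a string, os.devnull gives the right value for the current platform.
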
@scraperdragon changed the title from "pdftoxml in utils.py is not portable" to "pdftoxml in utils.py is not portable to Windows." on May 1, 2015
@aparna06

Is this still true?

I am trying to convert PDF to XML on a Windows machine with Python 3, and I am getting an error on the line "return xmldata.decode('utf-8')".

Please let me know.

@StevenMaude
Contributor Author

This is still the case as no-one's changed the code there.

You could:

  • just run pdftohtml.exe separately and dump the results to a file (either via a script, or via Python subprocess or however you like)
  • or you can try using this as a starting point for replacing the code in this package.
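
As a rough illustration of the first option, the sketch below calls pdftohtml through Python's subprocess module and reads back the file it writes. The function names build_command and convert_pdf_to_xml are hypothetical. The flags shown (-xml, -enc, -noframes) are assumed to be supported by the installed pdftohtml build, and the code assumes the tool writes its output to the given base name plus ".xml"; check both against your version.

```python
import shutil
import subprocess


def build_command(pdf_path, output_base, exe="pdftohtml"):
    """Return the argument list for an XML conversion (flags are assumptions)."""
    return [exe, "-xml", "-enc", "UTF-8", "-noframes", pdf_path, output_base]


def convert_pdf_to_xml(pdf_path, output_base, exe="pdftohtml"):
    """Run pdftohtml and return the XML it wrote, as a str."""
    # shutil.which honours PATHEXT on Windows, so "pdftohtml" matches pdftohtml.exe.
    if shutil.which(exe) is None:
        raise FileNotFoundError(f"{exe} was not found on PATH")
    # Passing a list avoids shell quoting and any /dev/null or NUL redirect.
    subprocess.run(
        build_command(pdf_path, output_base, exe),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=True,
    )
    # Assumption: the tool writes <output_base>.xml.
    with open(output_base + ".xml", encoding="utf-8") as handle:
        return handle.read()
```

In Python 3, a file opened in text mode already yields str, so the result needs no further .decode('utf-8') call.
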

@aparna06

Thank you. I shall try both.
