-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make python based exportNotebook script (also add support for mac and potentially windows) #22
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Marcello Seri <[email protected]>
Signed-off-by: Marcello Seri <[email protected]>
It implements a first version of core functions and drops the pdfinfo dependency but introduces the python3 PyPDF2 and paramiko dependencies. Pdftk is now optional but the merge function using PyPDF2 is not yet implemented. Signed-off-by: Marcello Seri <[email protected]>
When used, this removes the dependency of pdftk. Signed-off-by: Marcello Seri <[email protected]>
Signed-off-by: Marcello Seri <[email protected]>
Signed-off-by: Marcello Seri <[email protected]>
Signed-off-by: Marcello Seri <[email protected]>
Signed-off-by: Marcello Seri <[email protected]>
Signed-off-by: Marcello Seri <[email protected]>
Signed-off-by: Marcello Seri <[email protected]>
Signed-off-by: Marcello Seri <[email protected]>
I have been testing this on multiple PDFs, I have some examples in which the offsets are coming out incorrect (happens also with the old pdftk-based implementation though). Unfortunately I cannot publish the PDFs where this is happening. Interestingly enough the issue does not show up with remt, so there may be some incorrect computation here (I wonder if it has to do with the fact that we comput the x offset but we never touch the y offset). It is not a regression but I will try to make some tests to see if it can be fixed. I still believe this PR is useful as remt is problematic to use on windows due to its dependency on cairo and poppler, while this new version does not have any unusual dependency |
I have tweaked the exportNotebook script to work with US letter paper size (a scale factor and vertical offset are needed for this, whereas a horizontal offset is needed for A4 paper size). I have some pdf files that it does funny things on, so it is not quite ready to share. I'll hopefully update soon. |
@edupont is there any interest in this python conversion PR? |
@Lspencer neat! We should probably make exportNotebook generic on the page format! |
An update on my script to handle A4 and USletter paper with ExportNotebook: I was testing with a student thesis chapter and needed a vertical offset (plus a scaling factor) instead of a horizontal offset. I got this working and then tested on 2 other files. Both did not work out of the box. I found a work-around for one (us gs to scale by 1 and offset by 0, and then it would scale and offset properly), but the other one would not scale, so I had to scale the rM lines to the original page size. The latter is OK, but edge comments are clipped sometimes. I am not the author of any of these pdfs, so I do not know all of the differences, and haven't been able to duplicate these results on pdf files I create. With the pdf file that would not scale, the gs routine I was using modified the mediaBox, but not the artBox, clipBox, etc. I'll try to clean up my script and then post it, but I would like to better understand why this is failing on some pdfs. I can try to post a MWE but need these files to be my own (not student work), so I still am huntign around to recreate this problem. |
It may be useful to create a repository with a set of pairs (lines, PDF) to test, maybe even be included as a test suite ? |
@trou that would be great! |
@ericsfraga where can I find your updated script and the test file? I'm willing to play around with it |
@ericsfraga would you mind making a PR to my |
The modified python script is here along with the test file here. The script now expects a width and a height which you can extract from the PDF using pdfinfo. I have a shell script that uses rM2svg to create annotated PDF files. The shell script expects a .lines file as the argument. It then looks for a .pdf with the same base name. If it finds the file, it extracts width and height from that file and then creates an annotated file. If no PDF is found, it assumes that the .lines file refers to a notebook on the remarkable and so simply creates a PDF from the lines file directly. I've been using my shell script with the updated rM2svg script in anger for a few weeks now and it's working very well. Also, my shell script will also incorporate any template that was used for notebook documents. It does assume the templates have been copied over from the remarkable tablet itself. |
@ericsfraga I tried your rM2svg script and it worked well with my letter-format test file. I'm not sure what the purpose of your hardcoded 2.3 constant is. I would suggest a slightly simpler version that just wraps everything in a |
Ah, the 2.3 was a holdover from when I was manually playing with A4 size paper and comes from the ratio of 1404 to the width of an A4 page. It should probably be calculated directly as the 1404 divided by actual target width. This does assume that the height ratio is similar... bit of a hack, I guess. Regarding your suggestion about wrapping within an appropriate tag, I'm sure that's probably the right way to go about it but I sure do not have the expertise in SVG... In any case, I'm glad it worked with your letter format page as well! And why the default export mechanism is so bad... ummm, no idea really. |
The correct "algorithm" seems to be to scale down the page to the correct width keeping the aspect ratio, and then truncate/extend the SVG so that it also has the correct target height. I tried to do this in my code. |
The correct "algorithm" seems to be to scale down the page to the correct
width keeping the aspect ratio, and then truncate/extend the SVG so that it
also has the correct target height. I tried to do this in my code.
…On Mon, Oct 22, 2018 at 5:26 PM Eric S Fraga ***@***.***> wrote:
Ah, the 2.3 was a holdover from when I was manually playing with A4 size
paper and comes from the ratio of 1440 to the width of an A4 page. It
should probably be calculated directly as the 1440 divided by actual target
width. This does assume that the height ratio is similar... bit of a hack,
I guess.
In any case, I'm glad it worked with your letter format page as well!
And why the default export mechanism is so bad... ummm, no idea really.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#22 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAFyY9FhRIy1gROJkh9IwNAB4ZYITOqRks5uneOugaJpZM4Un3Y_>
.
|
Signed-off-by: Marcello Seri <[email protected]>
This should resolve the scaling issues for the PDFs that have all the pages of the same size Signed-off-by: Marcello Seri <[email protected]>
@mseri Not sure if mixing this with the prepare_background functionality you have makes sense. In fact the '{translate -offset 0}' ghostscript command does not work at all for the documents I was trying to export, hence I switched over to generating the appropriate foreground image in the first place. |
I like the idea of generating the "right" foreground so that there are as few rescaling as possible in the document merge. It does not seem that this will be merged anytime soon, I'll play around with it and see if it resolves some of the issues I've found |
This has gotten stale. I am starting working on the version supporting the new format, integrating @ericsfraga changes as well. Would it make sense to just fork this. It seems to have become unmaintained |
I cannot speak for the original author but do let us know if you do fork this. I am waiting impatiently for the latest upgrade to the system to see if things break again; I don't expect they will but you never know! |
I will update the code to use the new rm2svg for .rm files, generate the appropriate foreground, and move everything to a separate repo. My hopes are for all this functionality to be integrated directly in rmapi (as it seems to be happening), but I want to have working colorful export for myself in the meanwhile. I'll ping here in some weeks, as soon as I have done it |
This allows to drop the dependency on
pdfinfo
and makes bothpdftk
andghostscript
optional, at the price of introducing a dependency onPyPDF2
(completely python based) andparamiko
(almost purely python based).I also believe that the python3-based script is easier to read and maintain. I think it could even allow to run this script on Windows, but I have not tested it.
If you are happy with this PR and such a drastic change, I can update the README.
We could also add a
setup.py
file with an entrypoint forexportNotebook
andrM2svg
, and package the whole thing as a standalone pip-installablepython3 package
. I am happy to add it if you believe this is worthwhile.There is space for further cleanup and unit-testing. Currently I have tested this on multiple notebooks using the latest osx and ununtu, and I am very happy with the results.
Closes #21