Is it possible to convert to SVG but keep text as text? #17
Comments
I think "pdf2svg" cannot do anything about that; it depends on the Poppler or Cairo library.
@RonanKER, do you have any code or configuration that shows this?
If you want to keep text in the SVG then your best bet is to use Inkscape. I'm fairly sure it can be used from the command line to automate the conversion with text (though I've never used it for automated PDF -> SVG, only manually). Be aware that text often moves around a bit (the kerning is often a little off) when converting from a PDF.
See https://inkscape.org/doc/inkscape-man.html for details on the Inkscape command line.
I have been learning Inkscape for a week. As far as I know, Inkscape can only convert the first page of a PDF to SVG. Is that right?
It can open any page when opening with the GUI. If you want everything via the command line, you can simply use qpdf or pdftk to extract the page you want from the PDF as a single page and then use Inkscape. (Inkscape might be able to do page selection from the command line, I just don't know how.)
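The extract-then-convert approach described above can be sketched in Python as follows. This is only an illustration under assumptions: it assumes `qpdf` and an Inkscape 1.x binary (which accepts `--export-type` and `--export-filename`) are on the PATH, and the helper names `qpdf_extract_cmd`, `inkscape_svg_cmd`, and `convert_page` are made up for this example. Whether text survives as text still depends on the PDF and on how Inkscape imports it.

```python
import subprocess
import tempfile
from pathlib import Path


def qpdf_extract_cmd(src: str, page: int, dst: str) -> list[str]:
    """Build a qpdf command that copies one page (1-based) of src into dst."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    # --empty starts from a blank document; --pages ... -- selects the page.
    return ["qpdf", "--empty", "--pages", src, str(page), "--", dst]


def inkscape_svg_cmd(src: str, dst: str) -> list[str]:
    """Build an Inkscape 1.x command that exports src as an SVG file."""
    return ["inkscape", src, "--export-type=svg", f"--export-filename={dst}"]


def convert_page(src: str, page: int, out_svg: str) -> None:
    """Extract a single page with qpdf, then convert it with Inkscape."""
    with tempfile.TemporaryDirectory() as tmp:
        single_page = str(Path(tmp) / "page.pdf")
        subprocess.run(qpdf_extract_cmd(src, page, single_page), check=True)
        subprocess.run(inkscape_svg_cmd(single_page, out_svg), check=True)
```

For example, `convert_page("in.pdf", 3, "page3.svg")` would run both tools in sequence. With pdftk instead of qpdf, the extraction step would be `pdftk in.pdf cat 3 output page.pdf`.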
I have an old batch script from 2015, when I tried this (with pdftk and Inkscape): I put several example/test PDF files in the folder 'in', then launched several similar batch files to try different solutions (Inkscape, pdf2svg, PDFTron, Poppler, ...) and compared the results. If you can afford it, I think PDFTron was the best, but I'm not sure it would preserve text as you wish.
Could anyone point me in the right direction to understand why neither Cairo nor Poppler preserves text during PDF to SVG conversion (so that I can find a workaround to force them to keep it)? Does this procedure have a name? Is it "text vectorization" by any chance? By the way, I've tried Inkscape as well, but no luck. LibreOffice seemed to work, but it was extremely slow and created a large .svg file that is very hard to open.
I'm not sure what the name is ("preserve text" would have been my guess). Inkscape is usually the best in recent years - I've not had any problems with the PDFs that I've given it recently. It might be worth running pdftotext on your PDF to see if it does actually contain any text.
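The pdftotext check suggested above could be automated roughly like this. It is a sketch that assumes Poppler's `pdftotext` is installed and on the PATH; the function names `has_text` and `pdf_contains_text` are hypothetical. An empty result only means pdftotext found no extractable page text, which is also what you would expect if the text lives in outlines or in objects the tool does not read.

```python
import subprocess


def has_text(extracted: str) -> bool:
    """Return True if the extracted output has any non-whitespace characters."""
    # pdftotext separates pages with form feeds, which strip() also removes.
    return bool(extracted.strip())


def pdf_contains_text(pdf_path: str) -> bool:
    """Run pdftotext on pdf_path and report whether it produced any text."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],  # "-" sends the text to stdout
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,
    )
    return has_text(result.stdout)
```

Calling `pdf_contains_text("in.pdf")` before converting gives a quick hint about whether there is any text for a converter to preserve in the first place.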
After some research on PDFs in general, I realized that the problem was that the text was not "regular text" but part of "annotation/comment" objects. These often get ignored on import, and I believe Inkscape excluded them as well.
Similarly, and somewhat related, I believe it would be practical to preserve the hyperlinks, ideally embedded in the text itself, or at least surrounding the glyph paths.