I've been wanting to make a PDF zoomable for my e-ink ereader, and I've found that in order to do that, I need text that wraps automatically (else I have to perpetually move the page around to read the text).
I would like to retain font sizes, as the ODT export does (the HTML export does not, for some odd reason, even though the data is in it...?), so I can differentiate titles from paragraphs. With HTML, reflowing works...
For the reflowing to work in ODT, paragraphs (that are recognized) may not contain any hard line breaks.
Could you please add an option to remove those from recognized paragraphs?
And add an option to insert/keep hard linebreaks within a paragraph when the length of the line is less than x percent of the paragraph width? Those are usually lines where it makes sense to have that hard break.
And / or add an option to apply recognized styling to text in HTML? It's frustrating to have that sit in the title attribute, but not being used... Or am I misunderstanding something?
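To make the "x percent" idea concrete, here is a rough sketch of the heuristic in Python. It is not how the tool works internally; `reflow_paragraph` and `min_fill` are made-up names, and character count is used as a stand-in for the real line width, with the longest line of the paragraph taken as the paragraph width.

```python
def reflow_paragraph(lines, min_fill=0.6):
    """Join the lines of one recognized paragraph into reflowable text.

    A hard break is kept after any line shorter than `min_fill` times the
    paragraph width; every other line end becomes a space.
    Assumption: character count approximates the rendered line width.
    """
    stripped = [line.strip() for line in lines]
    if not stripped:
        return ""
    # Treat the longest line as the full paragraph width.
    width = max(len(line) for line in stripped)
    parts = []
    for index, line in enumerate(stripped):
        parts.append(line)
        if index == len(stripped) - 1:
            break  # no separator after the last line
        # Short line: the break was probably intentional, so keep it.
        parts.append("\n" if len(line) < min_fill * width else " ")
    return "".join(parts)


paragraph = [
    "Dear reader,",
    "this is a long line of recognized text that",
    "continues on the next line until it ends.",
    "Bye",
]
print(reflow_paragraph(paragraph))
```

With the default threshold, the break after "Dear reader," is kept and the other lines are joined with spaces. Using the bounding-box widths from the recognition result instead of character counts would likely be more accurate.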
@manisandro Thanks, I see, it's a special kind of XHTML, and not supposed to be used in a browser, but for overlay PDFs with image / text layer. I thought it meant HTML in the save dialog. What 'hOCR' in the dropdown meant wasn't clear to me, but it provided recognition of font sizes and paragraphs, according to the available settings, and that was what I had been looking for.
Right now I have a tiny script bound to a hotkey that removes single line breaks from the text in the clipboard. I'd upvote a solution in this nice tool (at least for plain text). It could be as simple as replacing "-\n" with "" and then "\n" with " ". Maybe double line breaks ("\n\n") can be left alone using some regex?
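For reference, the two replacements can be done with regex lookarounds so that blank-line paragraph breaks survive. This is only a sketch of the idea, not the tool's code; `unwrap` is a made-up name, and it assumes plain text with "\n" line endings.

```python
import re


def unwrap(text: str) -> str:
    """Remove single line breaks but keep blank-line paragraph breaks."""
    # Join words hyphenated across a line break: "exam-\nple" -> "example".
    # Caveat: a genuine hyphen at a line end ("well-\nknown") is lost too.
    text = re.sub(r"-\n(?!\n)", "", text)
    # Replace a newline with a space only when it has no newline on
    # either side, so "\n\n" is left untouched.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    return text


sample = "exam-\nple text\nmore\n\nnext para"
print(repr(unwrap(sample)))  # 'example text more\n\nnext para'
```

The lookbehind `(?<!\n)` and lookahead `(?!\n)` are what keep the double line break intact.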