Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Page splitting: revamp, fix some issues #443

Merged
merged 11 commits into from
Jul 10, 2021

Conversation

poire-z
Copy link
Contributor

@poire-z poire-z commented Jul 9, 2021

(Upstream) lvstring: fix signed integer overflow
(Upstream) lvopc: fix memory leak
(Upstream) ldomDocument(): fix memory leak
(Upstream) lvtinydom: fix memory leaks

From buggins/coolreader#299

LVArray: fix indexOf() to work with any type

Fix oversight in our decb46c / buggins/coolreader@a112014#diff-f37b0ebc86a9c3391b92d47a8dce2d3cdb5cc270e2dcf07b4ef73af5e1d4af6f I guess,
(Pinging @virxkane for info).

LVDocView: fix m_twoVisiblePagesAsOnePageNumber uninitialised

CSS: line-break: accept "-cr-loose" to ignore no-break-spaces

Can be handy to counteract #423 (comment).

CSS: font-variant: extend effect of "normal" and "none"

See, and should allow closing koreader/koreader#7913.

lvrend: fix BlockFloat footnotes context line associations

Footnote links inside floats were previously associated to the line before the float, which could make them shown on the previous page if the float needs a page break to be shown without break.
We now delay these links associations, and associate them only when the float is fully passed by, so the footnote is never before the float.

In-page footnotes: support links inside inline-block

Was too lazy to handle it at the time, turned out it wasn't that complicated :)

Page splitting: revamp, fix some issues

Rewritten page splitter, which should be easier to understand and tweak.
The old code was quite tedious, trying to do everything in one single pass.
This new code first finds out group of consecutive lines that should be kept together (break avoid), and process each group when they end, and only then the footnotes they contain (with delay if they don't fit).
This fixes 2 small issues (see koreader/koreader#5031 (comment)):

  • footnotes could slice a paragraph with floats on the line the footnotes happen; now, they are delayed till the end of the block, and the full block can go to the next page (with its footnotes) if it is taller than the current page.
  • if a first footnote and its separator didn't fit on a page, a new page was made so it can be added, leaving some blank at bottom of previous page ; now, we delay footnotes in this case, allowing an additional line of the main text to be added if it fits.

This change is Reviewable

virxkane and others added 11 commits July 9, 2021 14:06
Fix lvstring.cpp:127:42: runtime error:
signed integer overflow: 1775760480 * 31 cannot
be represented in type 'int'
Direct leak of 64 byte(s) in 4 object(s) allocated from:
  00 in operator new(unsigned long) (libasan.so.6+0xb0087)
  01 in OpcPart::readRelations() lvopc.cpp:73
  02 in OpcPart::getRelatedPartName(char32_t const*, lString32) lvopc.cpp:33
  03 in fb3ImportContext::openBook() fb3fmt.cpp:142
Indirect leak of 512 byte(s) in 4 object(s) allocated from:
  00 in operator new[](unsigned long) (libasan.so.6+0xb01f7)
  01 in LVHashTable<lString32, lString32>::LVHashTable(int) lvhashtable.h:107
  02 in OpcPart::readRelations() lvopc.cpp:73
  03 in OpcPart::getRelatedPartName(char32_t const*, lString32) lvopc.cpp:33
  04 in fb3ImportContext::openBook() fb3fmt.cpp:142
Indirect leak of 96 byte(s) in 4 object(s) allocated from:
  00 in operator new(unsigned long) (libasan.so.6+0xb0087)
  01 in LVHashTable<lString32, lString32>::set(lString32 const&, lString32) vhashtable.h:174
  02 in OpcPart::readRelations() lvopc.cpp:77
  03 in OpcPart::getRelatedPartName(char32_t const*, lString32) lvopc.cpp:33
  04 in fb3ImportContext::openBook() fb3fmt.cpp:142
Direct leak of 64 byte(s) in 1 object(s) allocated from:
 00 in operator new(unsigned long) (libasan.so.6+0xb0087)
 01 in tinyNodeCollection::allocTinyElement(ldomNode*, unsigned short, unsigned short) lvtinydom.cpp:16430
 02 in ldomDocument::ldomDocument() lvtinydom.cpp:4136
 03 in LVDocView::createEmptyDocument() lvdocview.cpp:4797
 04 in LVDocView::loadDocumentInt(LVFastRef<LVStream>, bool) lvdocview.cpp:4438
 05 in LVDocView::LoadDocument(char32_t const*, bool) lvdocview.cpp:4127
Direct leak of 128 byte(s) in 2 object(s) allocated from:
  0 in operator new(unsigned long) (libasan.so.6+0xb0087)
  1 in ldomNode::modify() lvtinydom.cpp:19148
  2 in ldomNode::removeChild(unsigned int) lvtinydom.cpp:18643
  3 in ldomNode::removeChildren(int, int) lvtinydom.cpp:5845
  4 in ldomNode::autoboxChildren(int, int, bool) lvtinydom.cpp:5995
  5 in ldomNode::initNodeRendMethod() lvtinydom.cpp:6842
  6 in ldomElementWriter::onBodyExit() lvtinydom.cpp:7815
  7 in ldomElementWriter::~ldomElementWriter() lvtinydom.cpp:7955
  8 in ldomDocumentWriterFilter::popUpTo(ldomElementWriter*, unsigned short, int) lvtinydom.cpp:13895
  9 in ldomDocumentWriterFilter::AutoOpenClosePop(int, unsigned short) lvtinydom.cpp:14126
  10 in ldomDocumentWriterFilter::OnTagClose(char32_t const*, char32_t const*, bool) lvtinydom.cpp:14657
  11 in LVXMLParser::Parse() lvxmlparser.cpp:278
Direct leak of 64 byte(s) in 1 object(s) allocated from:
  0 in operator new(unsigned long) (libasan.so.6+0xb0087)
  1 in ldomNode::modify() lvtinydom.cpp:19148
  2 in ldomNode::insertChildElement(unsigned int, unsigned short, unsigned short) lvtinydom.cpp:18534
  3 in ldomNode::autoboxChildren(int, int, bool) lvtinydom.cpp:5974
  4 in ldomNode::initNodeRendMethod() lvtinydom.cpp:6842
  5 in ldomElementWriter::onBodyExit() lvtinydom.cpp:7815
  6 in ldomElementWriter::~ldomElementWriter() lvtinydom.cpp:7955
  7 in ldomDocumentWriterFilter::popUpTo(ldomElementWriter*, unsigned short, int) lvtinydom.cpp:13895
  8 in ldomDocumentWriterFilter::AutoOpenClosePop(int, unsigned short) lvtinydom.cpp:14126
  9 in ldomDocumentWriterFilter::OnTagClose(char32_t const*, char32_t const*, bool) lvtinydom.cpp:14657
Direct leak of 64 byte(s) in 1 object(s) allocated from:
  0 in operator new(unsigned long) (libasan.so.6+0xb0087)
  1 in tinyNodeCollection::allocTinyElement(ldomNode*, unsigned short, unsigned short) lvtinydom.cpp:16424
  2 in ldomNode::insertChildElement(unsigned int, unsigned short, unsigned short) lvtinydom.cpp:18538
  3 in ldomNode::autoboxChildren(int, int, bool) lvtinydom.cpp:5974
  4 in ldomNode::initNodeRendMethod() lvtinydom.cpp:6842
  5 in ldomElementWriter::onBodyExit() lvtinydom.cpp:7815
  6 in ldomElementWriter::~ldomElementWriter() lvtinydom.cpp:7955
  7 in ldomDocumentWriterFilter::popUpTo(ldomElementWriter*, unsigned short, int) lvtinydom.cpp:13895
  8 in ldomDocumentWriterFilter::AutoOpenClosePop(int, unsigned short) lvtinydom.cpp:14126
  9 in ldomDocumentWriterFilter::OnTagClose(char32_t const*, char32_t const*, bool) lvtinydom.cpp:14657
  10 in LVXMLParser::Parse() lvxmlparser.cpp:278
  11 in LVHTMLParser::Parse() lvhtmlparser.cpp:157
Direct leak of 144 byte(s) in 2 object(s) allocated from:
  0 in operator new(unsigned long) (libasan.so.6+0xb0087)
  1 in CCRTable::LookupElem(ldomNode*, int, int) (cr3qt/cr3+0x1f6681a)
  2 in CCRTable::LookupElem(ldomNode*, int, int) (cr3qt/cr3+0x1f65cb3)
  3 in CCRTable::CCRTable(ldomNode*, int, bool, int, int, bool, bool, int, bool) lvrend.cpp:2243
  4 in renderTable(LVRendPageContext&, ldomNode*, int, int, int, bool, int, int&, int, bool, bool, bool) lvrend.cpp:2276
  5 in renderBlockElementEnhanced(FlowState*, ldomNode*, int, int, unsigned int) lvrend.cpp:7639
  6 in renderBlockElementEnhanced(FlowState*, ldomNode*, int, int, unsigned int) lvrend.cpp:7860
  7 in renderBlockElement(LVRendPageContext&, ldomNode*, int, int, int, int, int, int, int*, unsigned int) lvrend.cpp:8393
  8 in renderBlockElement(LVRendPageContext&, ldomNode*, int, int, int, int, int, int, int*) lvrend.cpp:8409
  9 in LVFormatter::measureText() lvtextfm.cpp:2045
  10 in LVFormatter::processParagraph(int, int, bool) lvtextfm.cpp:3809
  11 in LVFormatter::splitParagraphs() lvtextfm.cpp:4461
  12 in LVFormatter::format() lvtextfm.cpp:4508
  13 in LFormattedText::Format(unsigned short, unsigned short, int, int, int, bool, BlockFloatFootprint*) lvtextfm.cpp:4612
  14 in ldomNode::renderFinalBlock(LVRef<LFormattedText>&, RenderRectAccessor*, int, BlockFloatFootprint*) lvtinydom.cpp:19074
  15 in renderBlockElementEnhanced(FlowState*, ldomNode*, int, int, unsigned int) lvrend.cpp:8167
Indirect leak of 896 byte(s) in 28 object(s) allocated from:
  0 in __interceptor_realloc (libasan.so.6+0xaea88)
  1 in lxmlAttribute* cr_realloc<lxmlAttribute>(lxmlAttribute*, unsigned long) (cr3qt/cr3+0x1c890d6)
  2 in ldomAttributeCollection::set(unsigned short, unsigned short, unsigned int) lvtinydom.cpp:3933
  3 in ldomNode::setAttributeValue(unsigned short, unsigned short, char32_t const*) lvtinydom.cpp:17001
  4 in fixupMathML mathml.cpp:2157
  5 in fixupMathMLRecursive mathml.cpp:1737
  6 in fixupMathMLRecursive mathml.cpp:1734
  14 in fixupMathMLRecursive mathml.cpp:1734
  15 in fixupMathMLMathElement(ldomNode*) mathml.cpp:1708
  16 in ldomNode::initNodeRendMethod() lvtinydom.cpp:7777
  17 in ldomElementWriter::onBodyExit() lvtinydom.cpp:7819
  18 in ldomElementWriter::~ldomElementWriter() lvtinydom.cpp:7959
Footnote links inside floats were previously associated
to the line before the float, which could make them
shown on the previous page if the float needs a page
break to be shown without break.
We now delay these links associations, and associate
them only when the float is fully passed by, so the
footnote is never before the float.
Rewritten page splitter, which should be easier to
understand and tweak.
The old code was quite tedious, trying to do everything
in one single pass.
This new code first finds out group of consecutive lines
that should be kept together (break avoid), and process
each group when they end, and only then the footnotes
they contain (with delay if they don't fit).
This fixes 2 small issues:
- footnotes could slice a paragraph with floats on the
  line the footnotes happen; now, they are delayed till
  the end of the block, and the full block can go to the
  next page (with its footnotes) if it is taller than
  the current page.
- if a first footnote and its separator didn't fit on a
  page, a new page was made so it can be added, leaving
  some blank at bottom of previous page ; now, we delay
  footnotes in this case, allowing an additional line
  of the main text to be added if it fits.
@poire-z
Copy link
Contributor Author

poire-z commented Jul 9, 2021

No more CI checks :(
The links to previous CI checks have this on their page:
Since June 15th, 2021, the building on travis-ci.org is ceased. Please use travis-ci.com from now on.

@Frenzie
Copy link
Member

Frenzie commented Jul 9, 2021

Oh right, I forgot we were still using Travis here and there. I'll look into it.

@Frenzie Frenzie mentioned this pull request Jul 9, 2021
@virxkane
Copy link
Contributor

virxkane commented Jul 9, 2021

LVArray: fix indexOf() to work with any type

Fix oversight in our decb46c / buggins/coolreader@a112014#diff-f37b0ebc86a9c3391b92d47a8dce2d3cdb5cc270e2dcf07b4ef73af5e1d4af6f

Well, yes, it would be more correct.

@poire-z
Copy link
Contributor Author

poire-z commented Jul 10, 2021

@Frenzie : do you want this PR yet around when dealing with #444 for testing ?

Also, 2021.07 is due by today - which I guess won't happen :) I'd like this PR to have at least a week of night(ly) life before it ends up in a release. So, should I bump it quick, or rather wait ?

@Frenzie
Copy link
Member

Frenzie commented Jul 10, 2021

Let's wait a week (unless you want to prep it, I can do some final editing and tagging pretty quickly most days). My cat is seriously sick. I'm in the waiting room atm with nothing to do but I didn't bring my laptop.

@poire-z
Copy link
Contributor Author

poire-z commented Jul 10, 2021

Sorry to hear about your cat :/

Let's wait a week

You mean for the release, or for merging this PR ?
If for the release, then I guess we can merge this PR. Unless you need it if you plan to fix CI (which I guess you then won't soon - and that's ok!).
So, I guess I can merge and bump this?

@Frenzie
Copy link
Member

Frenzie commented Jul 10, 2021

I mean the release, yes.

@poire-z poire-z merged commit 01a6626 into koreader:master Jul 10, 2021
@poire-z poire-z deleted the pagesplitter_revamp branch July 10, 2021 17:05
Frenzie added a commit to Frenzie/crengine that referenced this pull request Aug 26, 2021
@Frenzie Frenzie mentioned this pull request Aug 26, 2021
Frenzie added a commit that referenced this pull request Aug 26, 2021
References <#443 (comment)>.

With fetch-depth: 0 is required for comparing to origin/master

Also see actions/runner#342 for notes on $TRAVIS_COMMIT_RANGE.
@poire-z
Copy link
Contributor Author

poire-z commented Mar 16, 2024

Mentionning a minor edgy-casy issue I'll be fixing, with screenshots for reference (easier done now while I'm at it, than later when I push a PR).

If a page ends with a link to a footnote that can be made in-page, and there is not enough room, the inpage-footnote is pushed to the next page. Nothing specific was done if the next page is in another "flow" (so, probably hidden): it would go into it.
With KOReader with "hide non-linear fragment", when on that main page (2nd page here) and taping to go to the next page, we would skip that hidden-flow page (and go to the 4th page here), and not see the inpage-footnote (and I was wondering: wtf, where is it ?! until I realized there is indeed a hidden flow page (the 3rd here), and I had to use Page Browser to jump into it and see the footnote):
image

The solution I guess is to get this:
image
with:

--- a/crengine/src/lvpagesplitter.cpp
+++ b/crengine/src/lvpagesplitter.cpp
@@ -932,6 +932,22 @@ public:
                 has_footnotes = true;
             }
         }
+        // Handle a little edge case
+        if ( cur_page_nb_lines == 0 && start_if_new_page >= 0 && (cur_page_nb_footnotes_lines>0 || !delayed_footnotes.empty()) ) {
+            // Empty page, but with footnotes or delayed footnotes not yet added.
+            // These footnotes are associated to the previous page's flow.
+            // If the lines we are about to add are from a different flow, we don't want
+            // these footnotes to be with them (or they could get hidden and skipped).
+            // If that's the case, add any delayed footnotes and create a new page.
+            LVRendLineInfo * line = lines[start_if_new_page];
+            if ( line->flow != prev_page_flow ) {
+                printf("line->flow != prev_page_flow %d != %d\n", line->flow, prev_page_flow);
+                flushCurrentPage();
+                // And call us again.
+                addLinesToPage(start, end);
+                return;
+            }
+        }
         #ifdef DEBUG_PAGESPLIT
             printf("PS: adding (%d+%d) to current page %d>%d\n", lines[start]->getStart() - cur_page_bottom,
                             lines_max_bottom - lines[start]->getStart(), cur_page_top, cur_page_bottom);

Feels a bit too easy :), so this will need testing, unless I was really prophetic when I wrote in the first post:

Rewritten page splitter, which should be easier to understand and tweak.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants