Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalized xpointers fix, some FB2 footnotes tweaks #329

Merged
merged 4 commits into from
Feb 26, 2020

Conversation

poire-z
Copy link
Contributor

@poire-z poire-z commented Feb 26, 2020

Fix issue with toStringV2() when the node is itself a boxing node. See #328 (comment).

Also a few commits related to FB2 footnotes (that I don't use, I just have 1 single FB2 file with footnotes...), which may help the user that posted at https://www.mobileread.com/forums/showthread.php?p=3956818#post3956818 on his 2 issues:

FB2 footnotes: fix spurious newline ("display: run-in")
I might have discovered why there was this strange support for that obscure/obsolete display: run-in:
https://css-tricks.com/run-in/ https://developer.mozilla.org/en-US/docs/Web/CSS/display
It might have been seen as a good and not-hackish solution to solve the fact that FB2 footnotes have a strange structure, and appears in the HTML as:

   <body>
    <section>
      <p>Text before<a l:href="#n1" type="note">[1]</a></p>
    </section>
    [...]
   </body>
   <body name="notes">
    <title><p>blah</p></title>
    <section id="n1">
      <title>
        <p>1</p>
      </title>
      <p>Text footnote</p>
    </section>
    <section id="n3">
      <title>
        <p>3</p>
      </title>
      <p>Table fntext2</p>
      <p>Table fntext2 2nd para</p>
      <p>Table fntext2 3nd para</p>
    </section>
  </body>
</FictionBook>

and so the footnote number and the footnote text being each in a block element, which would be rendered as:
image
By setting the title to be display: run-in, it merges as inline in the following block element:
image

I'm not sure support for display: run-in is really complete as per-specs (it should become block in many cases), but I guess its implementation in crengine was enough to solve this footnote rendering problem.
But it didn't work on the above sample - but it did if I removed all newlines and spaces...
I don't know if I broke that with my other changes over the last year - but I just tried to fix this case :) by just skipping the empty text node if there is one in between.

(I thought about just dropping support for "run-in", and use section > title { float: left; } - but there are so many section & title (chapter names) in FB2, with their own styling (centering) in fb2.css - that it would have need quite many fixes in there ... and I don't really much care about restyling FB2...)

FB2 footnotes: disable hardcoded handling as in-page
FB2 footnotes: allow them to be shown in popup
will allow FB2 footnotes to be shown as popups, or enabled back as in-page via a style tweak, just like EPUBs foonotes. For consistency.


For this FB2 user's other question at https://www.mobileread.com/forums/showthread.php?t=327670, we have in fb2.css:

cite p, epigraph p { text-align: left; text-indent: 0px }
cite { display: block; font-style: italic; margin-left: 5%; margin-right: 5%; text-align: justify; margin-top: 20px; margin-bottom: 20px }

so, I guess a style tweak with one of these (not sure what's wished) would work:
cite { display: inline !important }
or

cite p, epigraph p { text-indent: 1.2em !important; }
cite { margin-top: 0 !important; margin-bottom: 0 !important; }

This change is Reviewable

When the ldomXPointer's node is itself a boxing node,
avoids getting "div[1]/div[2]/inlineBox.0", that
createXPointerV2() would stumble on, and get
"div[1]/div[2]/div[1]" instead (the boxing node
child, or its parent if no children).
No impact whether we set css_ff_serif or css_ff_sans_serif,
as every font is flagged as being sans-serif - but it
avoids wondering why this is different when debugging.
"display: run-in" (not supported by all browsers and possibly
deprecated) support might have been added only as a way to solve
a rendering issue with FB2 footnotes, where the footnote number
is in a block element, and the footnote text in another block
element. fb2.css sets the first block to be "display: run-in"
as an attempt to bring them both in an erm_final element so
they can be rendered on the same line.
This fixes the fact this was not happening when there is
a text node (a single "\n" in the HTML) between these two
nodes.
Copy link
Member

@Frenzie Frenzie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @poire-z)


crengine/src/lvstsheet.cpp, line 1486 at r1 (raw file):

                        strValue = joinPropertyValueList( list );
                    }
                    // default to sans-serif generic font-family (the default

So this has no actual impact?

@poire-z
Copy link
Contributor Author

poire-z commented Feb 26, 2020

Yes, no impact: we'll be comparing only fonts with css_ff_sans_serif with what's given here. So, either they all match or they all do not match: that will contribute the same way to the scoring.
It's just that I'll stop worrying: why do I see family: 1 here, while I see family: 2 everywhere :)

I'll probably drop the last commit, and do that in cre.cpp: otherwise, we can't have both popup footnotes and in-page footnotes at the same time - which we can with EPUBs.

@poire-z
Copy link
Contributor Author

poire-z commented Feb 26, 2020

(@hius07: that 2nd styletweak suggestion test_fb2_cite.css.txt works (indent, no top/bottom margin) on your sample FB2:)
image
(and the footnote will display in some days when this PR is merged and reaches frontend.)

@poire-z
Copy link
Contributor Author

poire-z commented Feb 26, 2020

OK, dropped the last commit: will be handled in cre.cpp.

@poire-z
Copy link
Contributor Author

poire-z commented Mar 14, 2020

@hius07 : can you register on github, so we can discuss and work out your FB2 issues, and keep a trace of all that?

Regarding https://www.mobileread.com/forums/showthread.php?t=328053:

fb2.css only has that for name="notes":

body[name="notes"] { font-size: 70%; }
body[name="notes"] section title { display: run-in; text-align: left; font-size: 110%; font-weight: bold; page-break-before: auto; page-break-inside: auto; page-break-after: auto; }
body[name="notes"] section title p { display: inline }

Duplicate them with name="comments", and it should solve your issue.

body[name="comments"] { font-size: 70%; }
body[name="comments"] section title { display: run-in; text-align: left; font-size: 110%; font-weight: bold; page-break-before: auto; page-break-inside: auto; page-break-after: auto; }
body[name="comments"] section title p { display: inline }

But I'm not sure what the differences between footnotes in "notes" and those in "comments" are...
image
In blue, one from "notes". In red, one from "comments".

My question is whether they should both have the same styles (font size), or if there is some kind of hierarchy between them, one deserving to be different than the other.

pinging @pkb and @virxkane also, as I guess you know more about FB2 than any of us here :)
@virxkane: since you picked my latest commits, you won't have in-page FB2 footnotes anymore, until you add into your fb2.css/fb3.css:

body[name="notes"] section,
body[name="comments"] section
{
    -cr-hint: footnote-inpage;
    /* margin: 0 !important; this one is up to you */
}

edit: https://sudonull.com/post/16143-Electronic-books-and-their-formats-FB2-and-FB3-history-pros-cons-and-principles-of-work a little history of that FB2 format (in english)

@Frenzie
Copy link
Member

Frenzie commented Mar 14, 2020

@poire-z Looks like they're supposed to be endnotes:

http://www.fictionbook.org/index.php/%D0%AD%D0%BB%D0%B5%D0%BC%D0%B5%D0%BD%D1%82_body

При наличии в тексте сносок () сам текст сносок должен размещаться во втором с атрибутом name="notes".
При наличии кроме сносок также комментариев и т.п., что в оригинале размещено в конце книги - создается третье с атрибутом name="comments" или без атрибута name.

DeepL translated:

If there are footnotes in the text () the text of the footnote itself should be placed in the second with the attribute name="notes".
If in addition to footnotes there are also comments, etc., that are placed in the original at the end of the book - a third is created with the attribute name="comments" or without the attribute name.

Sounds like an anti-feature, imho.

@pkb
Copy link
Contributor

pkb commented Mar 14, 2020

Yes, "comments" are endnotes. Users get confused often as to why some displayed in page and some not.

@poire-z
Copy link
Contributor Author

poire-z commented Mar 14, 2020

OK. So, "notes" can go in-page, but "comments" could be prefered to not go in-page.
(Then, we'd have to deal with both kinds in popup footnotes :|)

@pkb: any reason to not make them all in-page to avoid these users confusion ? :)

btw, 2e315f2 makes our popup "styles have changed... reload the doc?" appear on any typesetting change (geometry, font size...) on this FB2 book (but not others)... so, I guess I'll have to fix that before bumping my latest crengine changes :(

@pkb
Copy link
Contributor

pkb commented Mar 14, 2020

I don't have any :) but it is already long time this way, from the beginning I guess :).

@poire-z
Copy link
Contributor Author

poire-z commented Mar 14, 2020

2e315f2 makes our popup "styles have changed... reload the doc?" appear on any typesetting change

That's the footnotes :(

03/14/20-15:25:45 DEBUG CreDocument: set font hinting mode 1
setBoxingWishedButPreventedByCache autobox run-in /FictionBook/body[2]/section[1]/p.0
setBoxingWishedButPreventedByCache autoboxChildren /FictionBook/body[2]/section[1]/title.0
setBoxingWishedButPreventedByCache autobox run-in /FictionBook/body[2]/section[2]/p.0
setBoxingWishedButPreventedByCache autoboxChildren /FictionBook/body[2]/section[2]/title.0
setBoxingWishedButPreventedByCache autobox run-in /FictionBook/body[2]/section[3]/p.0
setBoxingWishedButPreventedByCache autoboxChildren /FictionBook/body[2]/section[3]/title.0
setBoxingWishedButPreventedByCache autobox run-in /FictionBook/body[2]/section[4]/p.0
setBoxingWishedButPreventedByCache autoboxChildren /FictionBook/body[2]/section[4]/title.0

@hius07
Copy link
Member

hius07 commented Mar 14, 2020

Good day guys, thank you for your response.

Referring to printed books, "notes" are footnotes and "comments" are endnotes.
In the meantime, it occurred very comfortable to have "comments" in-page, so it would be nice not to remove this possibility. To be not confused it is enough to set different styles for "notes" and "comments" (as did I).
To satisfy everybody: separate checkboxes [ ] In-page FB2 footnotes and [ ] In-page FB2 comments - would be perfect.

Regarding the issue in popup footnotes (no space after the number): adding the settings similar to the "notes" in fb2.css solves it.

The default styles for "notes" and "comments" can be the same, the user will tweak them via "User style tweaks" if needed. Or "comments" font-size can be a little bigger than the "notes".

@poire-z
Copy link
Contributor Author

poire-z commented Mar 14, 2020

@hius07 : thanks for joining in.

I just realized that even if we dont make "comments" in-page, they might need the same kind of setup in fb2.css as "notes". Otherwise, when browsing their end pages, they are displayed one-per-page (which to me feels bogus):

image

So, we should decide what is to be the default in fb2.css:

body[name="notes"], body[name="comments"] {
  font-size: 70%; /* should we hardcode them to be smaller ? or let them 100% and have that via style tweaks? */
} 
body[name="notes"] section title, body[name="comments"] section title {
    display: run-in; /* technical trick to have it inline with what's coming next */
    text-align: left; /* counteract default of title being centered (or use "start" in case there are RTL books in fb2) */
    font-size: 110%; /* should the number be bigger ? isn't bold enough ? */
    font-weight: bold;
    page-break-before: auto; /* there are to counteract the default of pb before regular <title> */
    page-break-inside: auto;
    page-break-after: auto;
} 
body[name="notes"] section title p, body[name="comments"] section title p {
    display: inline; /* counteract default of P being block */
} 

I'd let @hius07 play with that and decide what's best as the base :)

Then we could have 4 style tweaks for FB2, so probably in a submenu (because FB2 is a bit niche for most of our users):

In-page FB2 footnotes
In-page FB2 footnotes (smaller)
In-page FB2 endnotes (or "comment notes" ?)
In-page FB2 endnotes (smaller)

@hius07
Copy link
Member

hius07 commented Mar 14, 2020

I would propose by default:
font-size: 70% for "notes"
font-size: 80% for "comments"
If so, do we need a (smaller) option in style tweaks?

'endnotes' seems clearer than 'comment notes'.

No font increase for numbers, bold is enough.

@poire-z
Copy link
Contributor Author

poire-z commented Mar 14, 2020

Well, we have normal text size / smaller (80%) for EPUB and other footnotes.

It feels a bit arbitrary to set FB2 footnotes text to be forced smaller (70% or 80%), even more if we or the user choose to not have them in-page: then, following the links, why would they need to be smaller (and the user to hurt his eyes to read them) ? :)

@hius07
Copy link
Member

hius07 commented Mar 14, 2020

Agree, so 4 style tweaks is OK.

@poire-z
Copy link
Contributor Author

poire-z commented Mar 14, 2020

Thanks :)
Ok, I'll make a PR in the coming hours, with an updated fb2.css (and some fix to the above issue with the popup message, which I'm relieved was caused by some coding error of mine and is not, for now, a general issue with the idea).

@hius07
Copy link
Member

hius07 commented Mar 14, 2020

Great, thank you very much.
BTW, can we set numbers to bold in popup footnotes? Can we change font-face there?

@poire-z
Copy link
Contributor Author

poire-z commented Mar 14, 2020

As for the style tweaks, making both notes and comments in-page, should notes and comments, as they may appear mixed (as on my screenshots above), have different font sizes (or some other CSS style change, and which?) so one can distinguish between them? Could that have some sense or the reader wouldn't really bother?

@poire-z
Copy link
Contributor Author

poire-z commented Mar 14, 2020

BTW, can we set numbers to bold in popup footnotes? Can we change font-face there?

That would be in frontend/ui/widget/footnotewidget.lua (rather than hardcoded that here in the made up HTML):

/* Attempt to display FB2 footnotes as expected (as crengine does, putting
 * the footnote number on the same line as the first paragraph via its
 * support of "display: run-in" and a possibly added autoBoxing element) */
body > section > autoBoxing > *,
body > section > autoBoxing > title > *,
body > section > title,
body > section > title > p,
body > section > p {
    display: inline;
}
body > section > autoBoxing + p,
body > section > p + p {
    display: block;
}
/* may be adding this: */
body > section > autoBoxing > title,
body > section > title {
    font-weight: bold;
}

Works for me. Please have a try and tell me if it's fine (bold enough?).

@hius07
Copy link
Member

hius07 commented Mar 14, 2020

As for the style tweaks, making both notes and comments in-page, should notes and comments, as they may appear mixed (as on my screenshots above), have different font sizes (or some other CSS style change, and which?) so one can distinguish between them?

I believe they should look differently.
In the cited book (Karl Proffer on Nabokov's Lolita ) 'comments' are by the author, 'notes' are by translators.
As I wrote, as the current fb2.css has notes font-size=70%, I set comments font-size=80% in the User style tweaks.

@hius07
Copy link
Member

hius07 commented Mar 14, 2020

Works for me. Please have a try and tell me if it's fine (bold enough?).

Works for me as well and is fine.

Also I noticed there a comment regarding font-face in popup. The question is closed, thank you.

@hius07
Copy link
Member

hius07 commented Mar 14, 2020

I like the first option (as in picture).

@poire-z
Copy link
Contributor Author

poire-z commented Mar 14, 2020

I think this makes more sense:
image - image

                title = _("In-page FB2 footnotes"),
                {
                    id = "footnote-inpage_fb2";
                    title = _("In-page FB2 footnotes"),
                    description = _([[
Show FB2 footnote text at the bottom of pages that contain links to them.]]),
                    css = [[
body[name="notes"] section
{
    -cr-hint: footnote-inpage;
    margin: 0 !important;
    font-size: 70%;
}
                    ]],
                },
                {
                    id = "footnote-inpage_fb2_comments";
                    title = _("In-page FB2 endnotes"),
                    description = _([[
Show FB2 endnote text at the bottom of pages that contain links to them.]]),
                    css = [[
body[name="comments"] section
{
    -cr-hint: footnote-inpage;
    margin: 0 !important;
    font-size: 80%;
}
                    ]],
                    separator = true,
                },
                {
                    id = "fb2_footnotes_regular_font_size";
                    title = _("Keep regular font size"),
                    description = _([[
FB2 footnotes and endnote have a smaller font size by default. This allows them to be displayed with the normal font size, whether they are kept at the end of the book, or displayed in-page.]]),
                    css = [[
body[name="notes"] section,
body[name="comments"] section
{
    font-size: 100% !important;
}
                    ]],
                },

They would stay 100% when in-page are not enabled - because there's no real reason to reduce their font size when read at end of book.
As for in-page, 70% is a bit small to me (that's why we use only 80% with other footnotes :)
Would you be ok having footnotes at 80%, and endnotes at 85 or 90% ? Or FB2 readers are really fond of 70% ? :)

(the difference in phrasing/layout vs EPUB foonotes in the menu can be justified with the fact FB2 has 2 kind of footnotes :)

@hius07
Copy link
Member

hius07 commented Mar 14, 2020

For a long time we have been reading fb2-books on Kindles in CoolReader, until it was frozen a year ago (and does not work on PW4 at all).
CoolReader default fb2.css has body[name="notes"] { font-size: 70%; }, that is why many people are accustomed to it. As for me 70% is good, but I have no statistics from other users.
I proposed 70% for notes and 80% for comments, maybe 80 and 90 are good too.

@poire-z
Copy link
Contributor Author

poire-z commented Mar 14, 2020

OK, #333.

If you have time to test it: https://raw.githubusercontent.com/koreader/crengine/81f4a1d2a1e050c7adee2c5e89f17b374a60ea25/cr3gui/data/fb2.css to replace into koreader/data/.

And koreader/frontend/ui/data/css_tweaks.lua css_tweaks.lua.txt with 75%/85% (just to see, the diff is nearly unnoticable on my emulator - we'll go with 70/80% if you find that odd on your device).

@hius07
Copy link
Member

hius07 commented Mar 15, 2020

Replaced both fb2.css and css_tweaks.lua: all seems OK, 75%/85% looks good.

If unchecked In-page, both notes and comments in the end of the book are 100%, despite of unchecked Keep regular font size. Is it by design?

@poire-z
Copy link
Contributor Author

poire-z commented Mar 15, 2020

If unchecked In-page, both notes and comments in the end of the book are 100%, despite of unchecked Keep regular font size. Is it by design?

By "my" design :) My thoughts is there's no real reason to reduce their font size when they are read at end of book (except possibly aesthetics reasons) - if the user sets a good font size for reading, why arbitrary reduce it for notes? (in some paper books, end notes can be smaller - but in some other books they aren't - so, publisher choice).
Previously: "comments" were 100% (as we didn't handle them at all).
"notes" were 70%, but that was because showing them "in-page" was hardcoded, and indeed, when "in-page", it's visually better to have them smaller.
You think they should be smaller?

"Keep regular font size" is for once in-page tweaks have reduced the font-size.
Do you think we should have another tweak "smaller font size for notes when not in-page" ?

Solved your issue with your Ulysses book ?

@Frenzie
Copy link
Member

Frenzie commented Mar 15, 2020

In books I suspect it's more of a space saving measure for content many may barely read than an aesthetic choice. Especially in books that are advertised as having extensive notes I think endnotes are typically the same size.

And given that we're electronic, space is no concern. The choice is wholly aesthetic.

@poire-z
Copy link
Contributor Author

poire-z commented Mar 15, 2020

Regarding your Ulysse book, yes, may be there's something with this book and/or our style tweaks.

There are imbricated <SECTION>:
image
I dunno if this is per-FB2-specs, or if it has some sense.

I gave you:

body[name="notes"] section
{
    -cr-hint: footnote-inpage;
    margin: 0 !important;
    font-size: 75%;
}

which may indeed make the text 75% of 75%.

This might be better no it is not: the whole notes become one single footnote :(
body[name="notes"] > section with a > so it applies only on the direction SECTION child.

With this book, we should not apply to the first, only to the second. But that's hard go notice with CSS :(

@poire-z
Copy link
Contributor Author

poire-z commented Mar 15, 2020

Or may be split that tweak in 2: one for the in-page as-is, one for the font-size with >.

@poire-z
Copy link
Contributor Author

poire-z commented Mar 15, 2020

This one should work: css_tweaks.lua.txt

@hius07
Copy link
Member

hius07 commented Mar 15, 2020

I decided that Ulysses fb2 is not formatted correctly, so deleted my question (id c_1 should be comments, not notes; and nested sections in notes body is a rear thing).

Or may be split that tweak in 2: one for the in-page as-is, one for the font-size with >.

Maybe so.
I just wanted to understand the logic:
How can we get smaller notes at the end? Check In-page notes checkbox. :)

@virxkane
Copy link
Contributor

@virxkane: since you picked my latest commits, you won't have in-page FB2 footnotes anymore, until you add into your fb2.css/fb3.css:

After what commit? What have you changed that requires updating fb2.css and why?

@poire-z
Copy link
Contributor Author

poire-z commented Mar 15, 2020

1a3d236 and e13ea98
Explained in the comments there :)

Why: because we want to have them toggable from frontend - and that made some unneeded duplicated code as we can have that by just adding to fb2.css:

body[name="notes"] section    { -cr-hint: footnote-inpage; }
body[name="comments"] section { -cr-hint: footnote-inpage; }

@virxkane
Copy link
Contributor

@poire-z It not affected on desktop Qt version. Nor adding this lines to fb2.css nor reverting this commits.

@poire-z
Copy link
Contributor Author

poire-z commented Mar 15, 2020

You mean you still get "in-page" FB2 footnotes with https://github.com/virxkane/coolreader/tree/koreader-merge-post (where you have my 2 commits, that should disable any special handling of body with names="notes") ?! That's strange :|

@virxkane
Copy link
Contributor

No, I don't have "in-page" footnotes even when add corresponding lines into fb2.css.

@poire-z
Copy link
Contributor Author

poire-z commented Mar 15, 2020

And you remember had them in-page with your document previously?
Even with the one posted at https://www.mobileread.com/forums/showthread.php?t=328053 that we used above for testing?

@virxkane
Copy link
Contributor

@poire-z

And you remember had them in-page with your document previously?

I fix this... Problem found in yours strange workaround ( methods tinyNodeCollection::setBoxingWishedButPreventedByCache(), tinyNodeCollection::hasCacheFile() ) for cache files. I remove this crutch, fix tons of segfault and now it works fine. Maybe it can be usefull for you: https://github.com/virxkane/coolreader/commit/7e9ecaac3632c7c495be5f0a9ee44cfeb8779094, https://github.com/virxkane/coolreader/commit/e160b27c21635519ffd6515e7171f88772778419, https://github.com/virxkane/coolreader/commit/095f0e8ae863bf07d7d59fffc2a3c27b430ac4e6

@poire-z
Copy link
Contributor Author

poire-z commented Jun 26, 2020

All these hasCacheFile() / setBoxingWishedButPreventedByCache() were added because, since I've been working on crengine, there was the fact that:

  • we can move nodes around in the DOM only in the initial XML loading phase (when there is no cache file yet)
  • trying to do that when the DOM is loaded from a cache file always lead to a segfault

So, when we were changing styles (applying style tweaks with KOReader) on a already loaded (cache made) book, I had to prevent moving nodes around (all this boxing business), and let that fact known to our frontend code (setBoxingWishedButPreventedByCache()) so it can warn the user and suggest him to reload the book from scratch (invalidating the cache, reparsing the whole book XHTML).

So, you mean you fixed that ?!! Wow ! :)
(I really never understood the chunk/storage/TNC_PART* stuff...)

If that's really fixed, there would be indeed a lot of stuff to remove, and other stuff to implement - like the reverse of autoBoxChildren... - and I'm not sure I'll want to rework all that (when that "reload your book" suggestion takes care of that in a radical way :) but it's good to have this fixed!

But I'm still not sure how this affected your footnotes with FB2, while we had no issue with KOReader.

OK, may be there could be a in-between state (between "small book with no cache" and "large book with cache made at end"): "large book with cache made while loading because of memory limits" - that we didn't witness on our side because we increased a bunch of limits (see #100), that may be you did not ?

I'll look at your commits and pick your fixes - I'll probably won't kill the setBoxingWishedButPreventedByCache() anytime soon, because it's kinda working fine and safe for us :) (and would need lots of work, time and testing to be sure everything is fine.)

@virxkane
Copy link
Contributor

But I'm still not sure how this affected your footnotes with FB2, while we had no issue with KOReader.

This affects when DOC_BUFFER_SIZE have insufficient size, so most of the document content nodes are converted to an "inline" nodes. I didn’t understand your code very well, but it seems like that.

@virxkane
Copy link
Contributor

virxkane commented Jun 26, 2020

In ldomDataStorageManager::compact() if memory is exhausted ( condition (int)p->_bufsize + sumsize < _maxUncompressedSize failed ) document data swaped to cache file => we have cache and hasCacheFile() returns true.

@poire-z
Copy link
Contributor Author

poire-z commented Jun 26, 2020

Just curious: why at https://github.com/virxkane/coolreader/commit/e160b27c21635519ffd6515e7171f88772778419#diff-bf4790c6553b2dfbdb2c5d10684b9838L746 did you replace class ldomNode with struct ldomNode ?
It compiles without any complain - but it's strange that elsewhere, we keep using:

// forward declaration
class ldomNode;

friend class ldomNode;

In ldomDataStorageManager::compact() if memory is exhausted ( condition (int)p->_bufsize + sumsize < _maxUncompressedSize failed ) document data swaped to cache file => we have cache and hasCacheFile() returns true.

OK, I see. I guess on our side, we could have a loop an get stuck with the "Please reload" warning if we'd get a document too large that it would reach the limit. But strangely, I never witnessed that.

An other option for you, instead of getting rid of all my hasCacheFile() stuff - that may proves useful at some point - would have been to just change in lvtinydom.h:

     /// if a cache file is in use
-    bool hasCacheFile() { return _cacheFile != NULL; }
+    bool hasCacheFile() { return false; }

Or may be have that check a bit more clever. I guess we could still autobox stuff even if in the initial XML loading phase, we're at a point we have to create a cache file: may be as we keep just adding nodes, we're not boxing stuff that's already in the cache.
So, it could be changed to bool hasCacheFile() { return !in_xml_loading_phase && _cacheFile != NULL ; }

(I haven't yet managed to reproduce a crash like I had seen in the past when autoboxing with a cache file :/ - to see if your changes do fix that.)

@virxkane
Copy link
Contributor

virxkane commented Jun 26, 2020

Just curious: why at virxkane/coolreader@e160b27#diff-bf4790c6553b2dfbdb2c5d10684b9838L746 did you replace class ldomNode with struct ldomNode ?

Instances of type ldomNode allocated with alloca(), realloc(), not operator new[], cleared with memset(). Yes, ldomNode is lightweight class without polymorphism, without vtable and this work, but to make code more clear I rename to struct. Foward declaration just forgot to change, thank you, I make this later.

An other option for you, instead of getting rid of all my hasCacheFile() stuff - that may proves useful at some point

For example? Fact about using cache file at rendering stage, why?

Ok, basically I would like to share the fix for the old bug in tinyNodeCollection::loadNodeData(), but of course, this is you decision, what you do with this patch.

@poire-z
Copy link
Contributor Author

poire-z commented Jun 26, 2020

For example? Fact about using cache file at rendering stage, why?

I'd like to find a sample case that crashes with our current code (without your fix) if I remove the hasCacheFile stuff, but I have a hard time finding one :)
Anyway, the thing about preventing boxing stuff if there's a cache file started with #73, reworked in #86, with some details in koreader/koreader#3187 (comment) .
I have also witnessed some crashes that were fixed by preventing boxing in other contexts (like floats) if there is a cache file - so I guess for quite some time, it has been a reflex for me to add these kind of hasCacheFile checks. I still believe they might be needed.
But may be your loadNodeData and excludedChunk fixes do solve all this - I don't know yet :)

Ok, basically I would like to share the fix for the old bug in tinyNodeCollection::loadNodeData(), but of course, this is you decision, what you do with this patch.

I have already picked and adapted most of your recent changes (except the hasCacheFile removal), so thanks :)

@poire-z
Copy link
Contributor Author

poire-z commented Jun 27, 2020

OK, I managed to get some crashes after I have replaced the 15 occurences of:

-        if ( getDocument()->hasCacheFile() ) {
+        if ( false && getDocument()->hasCacheFile() ) {

One with a book with inline-block and floats (the book I've been playing with in #337 (comment)), when:
Set Block rendering mode to legacy, close book, clean cache, open book, close it (cache made), open it, set Block rendering mode to full featured ("web"), and voilà: crash:

Thread 1 "luajit" received signal SIGSEGV, Segmentation fault.
0x00007fffecacae30 in detectChildTypes (parent=parent@entry=0x5555562760d0, hasBlockItems=@0x7fffffffd3b7: false, hasInline=@0x7fffffffd3b8: false, hasInternalTableItems=@0x7fffffffd3d7: true,
    hasFloating=@0x7fffffffd3c0: false, detectFloating=detectFloating@entry=true) at crengine/src/../include/lvref.h:163
163         T * operator -> () const { return _ptr; }
(gdb) bt
#0  0x00007fffecacae30 in detectChildTypes (parent=parent@entry=0x5555562760d0, hasBlockItems=@0x7fffffffd3b7: false, hasInline=@0x7fffffffd3b8: false,
    hasInternalTableItems=@0x7fffffffd3d7: true, hasFloating=@0x7fffffffd3c0: false, detectFloating=detectFloating@entry=true)
    at crengine/src/../include/lvref.h:163
#1  0x00007fffecae3924 in ldomNode::initNodeRendMethod (this=this@entry=0x5555562760d0) at crengine/src/lvtinydom.cpp:6199
#2  0x00007fffecae69fe in updateRendMethod (node=0x5555562760d0) at crengine/src/lvtinydom.cpp:15537
#3  0x00007fffecac176c in ldomNode::recurseElementsDeepFirst (this=0x5555562760d0, pFun=0x7fffecae69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15531
#4  0x00007fffecac1764 in ldomNode::recurseElementsDeepFirst (this=0x555556275fc0, pFun=0x7fffecae69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#5  0x00007fffecac1764 in ldomNode::recurseElementsDeepFirst (this=0x555556275fb0, pFun=0x7fffecae69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#6  0x00007fffecac1764 in ldomNode::recurseElementsDeepFirst (this=0x555556275f70, pFun=0x7fffecae69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#7  0x00007fffecac1764 in ldomNode::recurseElementsDeepFirst (this=0x555556275f60, pFun=0x7fffecae69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#8  0x00007fffecac1764 in ldomNode::recurseElementsDeepFirst (this=0x555556275f40, pFun=0x7fffecae69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#9  0x00007fffecac1764 in ldomNode::recurseElementsDeepFirst (this=0x555556274b80, pFun=0x7fffecae69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#10 0x00007fffecac1764 in ldomNode::recurseElementsDeepFirst (this=0x555556274b70, pFun=pFun@entry=0x7fffecae69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#11 0x00007fffecac1786 in ldomNode::initNodeRendMethodRecursive (this=<optimized out>) at crengine/src/lvtinydom.cpp:15549
#12 0x00007fffecadf1bb in ldomDocument::render (this=this@entry=0x555556189120, pages=pages@entry=0x555555f49468, callback=0x555555f4fe30, width=width@entry=424, dy=dy@entry=568,
    showCover=showCover@entry=false, y0=0, def_font=..., def_interline_space=95, props=...) at crengine/src/lvtinydom.cpp:4521
#13 0x00007fffecb95092 in LVDocView::Render (this=this@entry=0x555555f493c0, dx=424, dx@entry=0, dy=568, dy@entry=0, pages=0x555555f49468, pages@entry=0x0)

Another one with a Wikipedia EPUB with in-page footnotes, same workflow (legacy cache, switch to full featured):

Thread 1 "luajit" received signal SIGSEGV, Segmentation fault.
0x00007fffecaaa7ab in hasInvisibleParent (node=0x5555562ba4f0, node@entry=0x5555562bc1b0) at crengine/src/../include/lvref.h:163
163         T * operator -> () const { return _ptr; }
(gdb) bt
#0  0x00007fffecaaa7ab in hasInvisibleParent (node=0x5555562ba4f0, node@entry=0x5555562bc1b0) at crengine/src/../include/lvref.h:163
#1  0x00007fffecac2eaa in ldomNode::initNodeRendMethod (this=this@entry=0x5555562bc1b0) at crengine/src/lvtinydom.cpp:5801
#2  0x00007fffecac69fe in updateRendMethod (node=0x5555562bc1b0) at crengine/src/lvtinydom.cpp:15537
#3  0x00007fffecaa176c in ldomNode::recurseElementsDeepFirst (this=0x5555562bc1b0, pFun=0x7fffecac69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15531
#4  0x00007fffecaa1764 in ldomNode::recurseElementsDeepFirst (this=0x5555562ba660, pFun=0x7fffecac69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#5  0x00007fffecaa1764 in ldomNode::recurseElementsDeepFirst (this=0x5555562ba570, pFun=0x7fffecac69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#6  0x00007fffecaa1764 in ldomNode::recurseElementsDeepFirst (this=0x5555562ba500, pFun=0x7fffecac69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#7  0x00007fffecaa1764 in ldomNode::recurseElementsDeepFirst (this=0x5555562ba4f0, pFun=0x7fffecac69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#8  0x00007fffecaa1764 in ldomNode::recurseElementsDeepFirst (this=0x5555562ba4e0, pFun=0x7fffecac69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#9  0x00007fffecaa1764 in ldomNode::recurseElementsDeepFirst (this=0x5555562ba4d0, pFun=pFun@entry=0x7fffecac69eb <updateRendMethod(ldomNode*)>)
    at crengine/src/lvtinydom.cpp:15528
#10 0x00007fffecaa1786 in ldomNode::initNodeRendMethodRecursive (this=<optimized out>) at crengine/src/lvtinydom.cpp:15549
#11 0x00007fffecabf1bb in ldomDocument::render (this=this@entry=0x5555561848a0, pages=pages@entry=0x555555f81378, callback=0x555555f0eab0, width=width@entry=512, dy=dy@entry=568,
    showCover=showCover@entry=false, y0=0, def_font=..., def_interline_space=100, props=...) at crengine/src/lvtinydom.cpp:4521
#12 0x00007fffecb75092 in LVDocView::Render (this=this@entry=0x555555f812d0, dx=512, dx@entry=0, dy=568, dy@entry=0, pages=0x555555f81378, pages@entry=0x0)
    at crengine/src/../include/crlocks.h:50

These kind of crashes might look like what I'd been used to see :)

And just applying this part of your recent changes is enough to fix these 2 crashes ! and have the boxings correctly made and content correctly displayed wrapped in inlineBoxes or floatBoxes.

--- a/crengine/src/lvtinydom.cpp
+++ b/crengine/src/lvtinydom.cpp
@@ -2201,17 +2204,32 @@ bool tinyNodeCollection::loadNodeData(lUInt16 type, ldomNode ** list, int nodeco
         int buflen;
         if (!_cacheFile->read( type, i, p, buflen ))
             return false;
-        ldomNode * buf = (ldomNode *)p;
-        if (!buf || (unsigned)buflen != sizeof(ldomNode) * sz)
+        if (!p || (unsigned)buflen != sizeof(ldomNode) * sz)
             return false;
-        list[i] = buf;
+        ldomNode * buf = (ldomNode *)p;
+        if (sz == TNC_PART_LEN)
+            list[i] = buf;
+        else
+        {
+            // buf contains `sz' ldomNode items
+            // _elemList, _textList (as `list' argument) must always be TNC_PART_LEN size
+            // add into `list' zero filled (TNC_PART_LEN - sz) items
+            list[i] = (ldomNode *)realloc(buf, TNC_PART_LEN * sizeof(ldomNode));
+            if (NULL == list[i])
+            {
+                free(buf);
+                CRLog::error("Not enough memory!");
+                return false;
+            }
+            memset( list[i] + sz, 0, (TNC_PART_LEN - sz) * sizeof(ldomNode) );
+        }
         for (int j=0; j<sz; j++) {
-            buf[j].setDocumentIndex( _docIndex );
-            if ( buf[j].isElement() ) {
+            list[i][j].setDocumentIndex( _docIndex );
+            if ( list[i][j].isElement() ) {
                 // will be set by loadStyles/updateStyles
-                //buf[j]._data._pelem._styleIndex = 0;
-                setNodeFontIndex( buf[j]._handle._dataIndex, 0 );
-                //buf[j]._data._pelem._fontIndex = 0;
+                //list[i][j]._data._pelem._styleIndex = 0;
+                setNodeFontIndex( list[i][j]._handle._dataIndex, 0 );
+                //list[i][j]._data._pelem._fontIndex = 0;
             }
         }
     }

So, well done !
(I still don't really understand what was happening, and why this patch solves all these issues :) Do you? And do you remember how you went to go suspecting that part of the code and know it was the cause ?)

So, yes, I guess we could remove all my hasCacheFile/setBoxingWishedButPreventedByCache.
I managed to switch back and forth between "legacy" and "web" renderings , without any popup asking for a reload; with that additional change:

--- a/crengine/include/lvtinydom.h
+++ b/crengine/include/lvtinydom.h
@@ -643,7 +643,7 @@ public:

     /// is built (and cached) DOM possibly invalid (can happen when some nodes have changed display style)
     bool isBuiltDomStale() {
-        return _nodeDisplayStyleHashInitial != NODE_DISPLAY_STYLE_HASH_UNITIALIZED &&
+        return false && _nodeDisplayStyleHashInitial != NODE_DISPLAY_STYLE_HASH_UNITIALIZED &&
                 _nodeDisplayStyleHash != _nodeDisplayStyleHashInitial;
     }
     void setNodeStylesInvalidIfLoading() {

It would be nice to be able to get rid of that, but it would need some additional care to properly reset rend methods of the previously inserted boxing elements that would no more be boxing (I might have already done that correctly for inlineBox and floatBox, but an autoboxing, which is display:block might need to be set to display:inline when it is no more needed if its siblings were to no more be blocks).
Or just unbox everything at start of re-rendering, and box again while rerendering as needed (to possibly get the same boxes).

But, given that in some cases, we remove from the DOM some empty text nodes, that can be ignored/removed between block nodes - but should be kept if these node styles are switched from display:block to display:inline, it might still be best to propose a full reload and rebuild of the DOM from the source XHTML.

@virxkane
Copy link
Contributor

(I still don't really understand what was happening, and why this patch solves all these issues :) Do you? And do you remember how you went to go suspecting that part of the code and know it was the cause ?)

I use address sanitizer to find this bug, reason I wrote in comments in method tinyNodeCollection::loadNodeData():
// _elemList, _textList (as `list' argument) must always be TNC_PART_LEN size
// add into `list' zero filled (TNC_PART_LEN - sz) items
Also affected fix in method ldomDataStorageManager::compact().
Options for cmake project: https://github.com/virxkane/coolreader/commit/c71851bf741dc943e2b240029e73cdb658406586

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants