Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lvtinydom: fix memory leaks #533

Merged
merged 1 commit into from
Oct 22, 2023

Conversation

benoit-pierre
Copy link
Contributor

@benoit-pierre benoit-pierre commented Oct 11, 2023

Between 128 and ~2M when doing a full load on 23 (out of ~1K) of the files I tested.

Examples:


This change is Reviewable

@poire-z
Copy link
Contributor

poire-z commented Oct 11, 2023

Have you managed to understand what persist() does ? and all the whole business about NT_ELEMENT vs NT_PELEMENT, _elemStorage (which must not yet be storage to disk) and the whole lifecycle of all that?

This bits of code are stuff I added, so it's quite possible I didn't think about that - there's not many ->persist() calls in the codebase, and many are commented out (meaning they probably were causing crashes :).
Only place that does it for newly created nodes is moveItemsTo() - which was code from before I jumped in, used with autoBoxing nodes - so it does not look too odd.
Can't really say if it's the right thing to do or not. If you're sure, fine with me.

@benoit-pierre
Copy link
Contributor Author

Have you managed to understand what persist() does ? and all the whole business about NT_ELEMENT vs NT_PELEMENT, _elemStorage (which must not yet be storage to disk) and the whole lifecycle of all that?

FWIU, modify() switch to a temporary (less optimized) modifiable state, and persist() is the reverse, read-only, state.

Copy link
Contributor

@poire-z poire-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you just fixed/added that for the ones that were reported by your analyzing tools ? Or did you put some brain into all the ones that look similar, and decided some but not all needed it ?

@@ -6919,6 +6919,7 @@ int initTableRendMethods( ldomNode * enode, int state )
}
}
}
tbox->persist();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels this should also be needed for the abox created 5 lines above.
And also the others issues from boxWrapChildren(), line 7614, 7905, 7990, 8089.
And from insertChildElement(), quite a few of them.

Comment on lines 8153 to +8155
// We let <rp> be invisible like other unexpected elements
}
rbox3->persist();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these, there's also a few rbox1/2/3 in the large if ( needs_wrapping ) { just above.
I'm not sure (without rereading this code if it's the same nodes we're meeting there, or not. And if all the previous ones have to also be ->persist()'ed for consistency - or if it's best not to because we're meeting them again here :/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, ideally, each node should be persisted as soon as possible, when there's no need to modify it anymore.

@benoit-pierre
Copy link
Contributor Author

Have you just fixed/added that for the ones that were reported by your analyzing tools ?

Fixing them as I find them. ¯\(ツ)

But again, I tested 1316 EPUBs, including the ones in test/epub2-conform/.

@poire-z
Copy link
Contributor

poire-z commented Oct 13, 2023

But again, I tested 1316 EPUBs

Sure they contain enough ruby and "bad" documents (with incomplete tables) so all these codepaths are taken and checked?
In case you need some samples, here are my *table* and *ruby* test files: test-table-ruby.zip (don't worry if they look bad or fail, it's just random fuzzing input for your tools if you need some more :)

@benoit-pierre
Copy link
Contributor Author

But again, I tested 1316 EPUBs

Sure they contain enough ruby and "bad" documents (with incomplete tables) so all these codepaths are taken and checked?

No idea (TODO: add coverage to test script attached below).

In case you need some samples, here are my *table* and *ruby* test files: test-table-ruby.zip (don't worry if they look bad or fail, it's just random fuzzing input for your tools if you need some more :)

No leaks.

Here is what I use for testing: leaktest.zip. To be decompressed/run from the source tree.

There are a number of (mostly exclusive) switch. Use -a to use libasan (e.g. ./leaktest.sh -a test/*.{djvu,epub,pdf,txt}).

@poire-z
Copy link
Contributor

poire-z commented Oct 13, 2023

No leaks.

Ok. So do you feel like adding persists() to all others ? Or not, to be on the safest side ? If not, may be add a comment above your first persist() in a function that // This could/should probably be done to others around here or something like that (otherwise, when reredeaing it in 5 years, I'll wonder if it is normal that it is done there but not done here :/ and I won't dare adding them, cause if they're not there, there must be a reason :)

Here is what I use for testing

I don't have all the tools to make use of if - so trusting you with all this :)
One thing you could add after the initial open & render(), is to trigger some setting change and have the document fully re-rendered: they are different codepaths taken whether intiial load (ie. styles computer while the dom is built) and later rerender (styles recomputed on an existing DOM).

@benoit-pierre
Copy link
Contributor Author

Here is what I use for testing

I don't have all the tools to make use of if - so trusting you with all this :)

You don't need all the tools, just libasan.so, which should be provided by GCC. I've amended the main shell script to not hard-code the library path
, and add support for coverage with kcov: leaktest.zip.

Coverage results for lvtinydom.cpp on the aforementioned ~1K EPUBs: 46.8%…

One thing you could add after the initial open & render(), is to trigger some setting change and have the document fully re-rendered: they are different codepaths taken whether intiial load (ie. styles computer while the dom is built) and later rerender (styles recomputed on an existing DOM).

Any suggestions?

@poire-z
Copy link
Contributor

poire-z commented Oct 13, 2023

Any suggestions?

Something similar to that (taking the Harfbuzz paths you might not have taken? Although I dunno if this could add some noise/variations to your tests, if some optimisation makes the FreeType path faster but the Harfbuzz path slower, you may not notice it).

--- a/frontend/apps/reader/readerui.lua
+++ b/frontend/apps/reader/readerui.lua
@@ -312,8 +312,13 @@ function ReaderUI:init()
             self:handleEvent(Event:new("PreRenderDocument", self.doc_settings))

             start_time = time.now()
+            self.document:setFontKerning(1) -- Freetype only
             self.document:render()
             logger.dbg(string.format("  rendering took %.3f seconds", time.to_s(time.since(start_time))))
+            start_time = time.now()
+            self.document:setFontKerning(3) -- Harfbuzz
+            self.document:render()
+            logger.dbg(string.format("  rerendering took %.3f seconds", time.to_s(time.since(start_time))))

@Frenzie
Copy link
Member

Frenzie commented Oct 13, 2023

On a related thought, do any of those test documents have footnotes?

@poire-z
Copy link
Contributor

poire-z commented Oct 13, 2023

On a related thought, do any of those test documents have footnotes?

As the test script doesn't provice any CSS snippet to enable them, I guess the documents are rendered without. (It feels it doesn't really matter, the footnote code is quite localized and comes late in the party, but who knows if I have related memory leaks in my PageSplitState2 :)

I'm also not sure if the test script gets crengine to use epub.css or no default stylesheet at all - but I guess it's ok as most EPUBs have their own stylesheets. Also dunno about hyphenation if it is active by default.

@benoit-pierre
Copy link
Contributor Author

OK, here's a version that goes through readerui:

G_defaults = require("luadefaults"):open()
local DataStorage = require("datastorage")
G_reader_settings = require("luasettings"):open(DataStorage:getDataDir().."/settings.reader.lua")

local einkfb = require("ffi/framebuffer") --luacheck: ignore
einkfb.dummy = true --luacheck: ignore

local Device = require("device")

local Screen = Device.screen
Screen:init()

local CanvasContext = require("document/canvascontext")
CanvasContext:init(Device)

local Input = Device.input
Input.dummy = true

local ffiutil = require("ffi/util")
local cachedir = "/tmp/cr3cache"
local cre = require('document/credocument'):engineInit()

local DocumentRegistry = require("document/documentregistry")
local ReaderUI = require("apps/reader/readerui")
local UIManager = require("ui/uimanager")

ffiutil.purgeDir(cachedir)
cre.initCache(cachedir, 0, true, 40)

function fastforward_ui_events()
    -- Fast forward all scheduled tasks.
    UIManager:shiftScheduledTasksBy(-1e9)
    UIManager:run()
end

for i, b in ipairs(arg) do
    print(string.format("%d/%d %s", i, #arg, b))
    io.stdout:flush()
    print(b)
    doc = DocumentRegistry:openDocument(b)
    if doc then
        local readerui = ReaderUI:new{
            dimen = Screen:getSize(),
            document = doc,
        }
        UIManager:quit()
        UIManager:show(readerui)
        UIManager:scheduleIn(1, function()
            UIManager:close(readerui)
            ReaderUI.instance = readerui
        end)
        fastforward_ui_events()
        readerui:closeDocument()
        readerui:onClose()
        readerui = nil
        UIManager:quit()
        UIManager._exit_code = nil
    end
end

Device:exit()

No I need to figure a way to trigger a re-render.

@benoit-pierre
Copy link
Contributor Author

As for the presence of footnotes, yes, this one for example has plenty of footnotes.

@Frenzie
Copy link
Member

Frenzie commented Oct 13, 2023

In that case they merely need to be triggered. :)

@benoit-pierre
Copy link
Contributor Author

Meaning? The footnotes? (The readerui version does call CreDocument:setStyleSheet).

@Frenzie
Copy link
Member

Frenzie commented Oct 13, 2023

I think we use a styletweak enabled by default for fully standard footnotes (with another one I personally also enable by default that catches many pre-standard footnotes). Without looking at the code or the result I wouldn't really know what's going on. :-)

@poire-z
Copy link
Contributor

poire-z commented Oct 20, 2023

Ok. So do you feel like adding persists() to all others ? Or not, to be on the safest side ? If not, may be add a comment above your first persist() in a function that // This could/should probably be done to others around here or something like that (otherwise, when reredeaing it in 5 years, I'll wonder if it is normal that it is done there but not done here :/ and I won't dare adding them, cause if they're not there, there must be a reason :)

Answer & thoughts?

@benoit-pierre
Copy link
Contributor Author

After more experimenting with my test code, it looks like the leaks are also stylesheet specific: without the patch, I get leaks with the default crengine stylesheet, but not Koreader custom one, except for your test-table-ruby.zip sample.

Anyway, I would keep the PR as is, and not add more persist calls without a good reason.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants