-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve css class matching #545
Conversation
A bit hard to get into with the existing, and your, names getAttribute/Value/Index, so better explicite naming may make it easier to follow.
It will probably fails with a non-small book, after you open and exit it (so a cache is created), and you re-open it (so, loading the DOM and attributes/values indexes from the cache) and re-render it: your You would either need to: We do A for You may have a hard time de/serializing (Thanks for not having me read About #529: looks like this is not one of the three ideas you mentionned? |
Thanks for the insights on the cache part. It will take some time before I can submit a working patch.
That's no accident, my dear friend :)
This is actually a scaled back version of my first idea. I experimented with hash, and also ID selectors, but haven't got a good result (yet).
Apologies for not replying further. It seemed I didn't make myself clear so I'd rather talk with code :p
I didn't fully get your idea in #529 until now :) |
IIRC, MuPDF uses perfect hashing via gperf for those ;). |
^ You mentionned that previously in #348. crengine/crengine/src/lvtinydom.cpp Lines 382 to 385 in 156d095
and that we could reuse I guess if we're not sure of our lString8::getHash() and collisions. |
MuPDF uses that for their CSS properties, which is known at compile time. In our case, the class names are dynamic, and I'm not sure gperf supports that. |
So I followed @poire-z 's input and implemented a version with |
Any idea why ?
Sure, I like it better for this reason. |
@@ -6022,14 +6049,25 @@ void LVStyleSheet::apply( const ldomNode * node, css_style_rec_t * style ) | |||
LVCssSelector * selector_0 = _selectors[0]; | |||
LVCssSelector * selector_id = id>0 && id<_selectors.length() ? _selectors[id] : NULL; | |||
|
|||
LVArray<lUInt32> class_hash_array; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Do you think there's any difference in speed between creating LVArray<lUInt32>
, adding into, and iterating them, vs. creating a single lString32 and using :pos()
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would dearly hope that an array would be significantly faster ;).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My wild guess is that they are of the same speed. lString32
is basically an LVArray<lUInt32>
plus a lot of string specific methods (and reference counting).
LVCssSelector now holds an LVRef to rules. This simplifies copying and destruction.
Well, to answer your question, I rerun the profiling and found that, for the whole load and render process, the new version is actually 0.18% faster (median) than the old one, and only 9% are slower. The old profiling data got overwritten so I can't really tell what went wrong. Memory allocation changes can ripple through the whole program with callgrind profiling. I did some manual comparison but didn't reach any conclusion. However, when a node has only one single class name, the old version doesn't calculate the attribute hash, or allocate an |
I'm now realizing that this simple solution won't allow for more similar optimisations :/ We would only be optimising the class check for the current node we are checking if it has a class= attribute, so optimising only If we want to optmize that too, we'd need to move the quick class hash checking in there: crengine/crengine/src/lvstsheet.cpp Lines 5166 to 5186 in 22191fd
actually replacing the string comparison here with the "hash present in an array of hashes" test (so, no need for a rarely used quickCheck() - just a quick regular check() :) So, maybe by reusing/tweaking some ideas from your initial implementation:
What do you think ? |
I've been experimenting the cache thing for a while. It is actually more involved than I expected. The patch is still far from reviewable, but I can share my two cents now since you ask:
My idea is to introduce a cache class ( In both Just as you said, the cache don't need to be stored in document cache. They are constructed on the fly during style calculation and discarded afterwards. I don't think the current commit could interfere with this design. We just need to cache the hash values in Implementation challenges:
|
Well, that's expected they are slow, given the multiple parent paths we need to check. I somehow can follow the rest of your ideas - but I don't really want to get much involved in these other complex node & style topics for now :) It's already complex, and I'm not sure I'd want more kludge if it is not elegant and not too intrusive. I was just focused on the scope of this PR, avoiding string comparisons in classname checks. |
Well.. misunderstanding again.. maybe I was too preoccupied by the whole I'd say your idea makes sense. Actually I had a failed attempt to completely remove the class string check. I'll do some experiments and come back to you. |
Class name matching is slow. It's extremely slow when there are many class selectors. Mitigate this problem by: 1. when parsing css, store class name's hash value 2. during css check, split class names and calculate their hashes 3. for class selectors, compare integers first to avoid expensive class name matching for most rules
Tweaking my initial implementation didn't deliver a better result than the simpler approach. It's still slower with my test set. So please consider merging my latest commit. Applying the quick check to |
I did some poor-man timings checks on 2 documents (one with some ancestors class selectors, timing around 1.6s - the other without much CSS but many nodes, timing around 10s), and I do observe (sometimes, not always, I don't often get consistent timings... dunno why) some noticable speedup, so I could be fine with that.
I'm not sure you got what I had in mind, and if you ended up testing what I imagined, so I went coding it, picking stuff from your initial implementation. Somehow, I don't notice the speedup I got with your current PR :/ I'm not sure why. Can you check how this patch does with your benchmarking tools ? |
I ran your patch through my test epubs. It was slower. I've attached the profiling result of one epub here: pr545.tgz. It's chosen because it has the median performance change among the 100 epubs. You may check it with KCacheGrind.
I can share with you my experiment with my initial implementation (and you're right: it's indeed not what you had in mind): bbshelper@e270699. Unfortunately It's also slightly slower than the current PR commit. I have also prototyped with the |
OK, thanks for your tests and thoughts. |
This is my first PR on #529
I've tested with load and render, but I've no knowledge of cache mechanism yet so not sure it works there. I publish them for review now because I want to avoid conflicts to subsequent commit(s) :)
Tested with two sets of books from Project Gutenburg, and cycle estimations of
LoadDocument()
reported by callgrind are as follows (fewer is better):This change is