Skip to content

do not deep copy history entries #3248

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 27, 2020
Merged

Conversation

vladak
Copy link
Member

@vladak vladak commented Sep 25, 2020

This was meant to help a bit #3243 however it does not do much for my particular use case as described in the issue. Still, should save some heap memory in case of running the indexer with -G.

@vladak vladak added the indexer label Sep 25, 2020
@vladak vladak changed the title History tagging fix do not deep copy history entries Sep 25, 2020
@coveralls
Copy link

coveralls commented Sep 25, 2020

Pull Request Test Coverage Report for Build 5467

  • 10 of 10 (100.0%) changed or added relevant lines in 1 file are covered.
  • 11 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.03%) to 75.682%

Files with Coverage Reduction New Missed Lines %
opengrok-indexer/src/main/java/org/opengrok/indexer/configuration/IndexTimestamp.java 2 42.86%
opengrok-indexer/src/main/java/org/opengrok/indexer/history/HistoryEntry.java 9 87.18%
Totals Coverage Status
Change from base Build 5462: -0.03%
Covered Lines: 41337
Relevant Lines: 54619

💛 - Coveralls

@idodeclare
Copy link
Contributor

@vladak what repo might have required the deep copy? It was quite explicit, so wondering if you’ve accounted for all use-cases.

@vladak
Copy link
Member Author

vladak commented Sep 25, 2020

@vladak what repo might have required the deep copy? It was quite explicit, so wondering if you’ve accounted for all use-cases.

I tried to come up with a reason for this code and could not find any, at least w.r.t. tags.

@vladak
Copy link
Member Author

vladak commented Sep 28, 2020

For the record the comment about deep copy comes from 3ba66fb by @skozina that added tagging capability to history view.

@vladak
Copy link
Member Author

vladak commented Oct 8, 2020

For this one I think the risk is worth the benefit.

@idodeclare
Copy link
Contributor

For this one I think the risk is worth the benefit.

I'm skeptical firstly because you're removing code conditional on repository.hasFileBasedTags() where only a tiny number of Repository subclasses are overriding hasFileBasedTags() and secondly because you write that "it does not do much for my particular use case." What benefit is there evidence for?

@vladak
Copy link
Member Author

vladak commented Oct 9, 2020

Right, I should have measured this first. Will try once indexer can gather JVM statistics (#3245).

@vladak
Copy link
Member Author

vladak commented Oct 9, 2020

I think it did not do much for my case because the overall memory overhead just trumped the benefit of this change by large margin.

@vladak vladak force-pushed the history_tagging_fix branch from d91d17e to 7079ff6 Compare October 10, 2020 16:16
@vladak
Copy link
Member Author

vladak commented Oct 10, 2020

Turned out my hunch was correct. I chose the same working set as in #3243 and same indexer options (obviously including -G) and same source code basis to avoid running git log with -m.

Here are JVM used memory graphs without and with the fix:
without -m and without fix
without -m and with fix

Both graphs show complete indexing process. This is first such time we have introspection into what is happening there. The first steep fall in the graphs correlate with the finish of the git log processes. The second graph spans sightly wider because there was another indexer running on the machine at that time. The history cache generation finished at 17:00 and 19:15. The first indexer finished around 17:40. The second peak of old generation area in the second graph is interesting, I don't have good explanation for it.

In overall this saved some 6 GB of heap on average during the history cache creation phase.

It might impact only a small subset of repository types however Git/Mercurial are widely used so the impact is high.

Side node: beware, exporting indexer metrics over statsd can quickly consume your mobile data plan limits as it can easily generate GBs of traffic.

@vladak
Copy link
Member Author

vladak commented Oct 11, 2020

The reason why this did not help with the case described in #3243 is that the effect of this change is visible only after the acquire of history entries has finished and the issue prevents that from completing.

@vladak vladak force-pushed the history_tagging_fix branch from 152093a to c6fdd29 Compare October 27, 2020 19:49
@vladak vladak merged commit 0b89ff1 into oracle:master Oct 27, 2020
@vladak vladak deleted the history_tagging_fix branch October 27, 2020 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants