-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use single map for collecting files across branches #839
Conversation
Here is the memory profile from dot com:
And the (truncated) output of
|
233878c
to
1564fe7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me! I am surprised the impact isn't bigger?
Baseline vs. PR
BenchmarkPrepareNormalBuild-10 1 1561963752 B/op 4249175 allocs/op
BenchmarkPrepareNormalBuild-10 1 1472635928 B/op 4238388 allocs/op
cmd/zoekt-repo-index/main.go
Outdated
@@ -359,12 +360,13 @@ func iterateManifest(mf *manifest.Manifest, | |||
return nil, nil, err | |||
} | |||
|
|||
files, versions, err := gitindex.TreeToFiles(topRepo, tree, projURL.String(), cache) | |||
rw := gitindex.NewRepoWalker(topRepo, projURL.String(), cache) | |||
versions, err := rw.CollectFiles(tree, rev, &ignore.Matcher{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to name this subVersions
, like in gitindex/index.go?
repos[k] = existing | ||
} else { | ||
repos[k] = BlobIndexInfo{Repo: v, Branches: []string{b}} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice optimization getting rid of this loop!
I was also surprised. When I looked at the local |
When looking at large profiles for
inuse_space
on dot com, I noticed the filename maps inprepareNormalBuild
taking a bunch of memory. This PR avoids allocating a separate map per branch, instead havingRepoWalker
collect all the entries in a single instance variable.This is one last piece of "low hanging fruit" related to index memory. After this, we need to take a holistic look and consider a true rewrite.
This helps allocations a bit on
sgtest/megarepo
. Note that I haven't been able to locally repro the type of profile we're seeing on dot com.Baseline vs. PR
Follow up to #838.