Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEVPROD-14754 Start Using Git Scalar for Clones #8696

Merged
merged 13 commits into from
Feb 18, 2025

Conversation

malikchaya2
Copy link
Member

DEVPROD-14754

Description

Scalar is a repository management tool that optimizes Git for use in large repositories and improves performance. It will only use scalar if the git version supports it. It also unregisters any repositories in teardown that were cloned with scalar to stop the background maintenance of the repositories.

I chose to add scalar as a git.get_project parameter rather than as a default because it can't be combined with cloneDepth or recurseSubmodules and would therefore require a migration. We can consider migrating and adding it as a default down the line once more projects are using it successfully.

Testing

Tested on staging and added some unit tests.
Comparing two different executions of the same task running git.get_project, it went from 16.533s to 2.328s for git.get_project.

Documentation

Added

@malikchaya2 malikchaya2 requested a review from a team February 10, 2025 13:35
@@ -1254,6 +1260,45 @@ func (a *Agent) clearGitConfig(tc *taskContext) {
return
}
logger.Info("Cleared git credentials.")

if err := unregisterScalar(); err != nil {
logger.Error(errors.Wrap(err, "unregistering scalar repositories"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't comment on above lines, but would it be better to continue on the errors above? For example, let's say .gitconfig doesn't exist, does that mean ..git-credentials doesn't and we don't need to unregister scalar? I think logging still makes sense and it would need to be rewritten, but I'm thinking in terms of resilience

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, I rearranged it to log but not return on error

agent/agent.go Outdated
}
}

// unregisterScalar unregisters a repository that was cloned (and therefore registered with) Scalar.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit: move the with outside the parentheses (but ty for the really understandable/context in the comment! and adding it to that cleargitconfig func too)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

agent/agent.go Outdated
var out bytes.Buffer
cmd.Stdout = &out
err = cmd.Run()
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit: if err = cmd.Run(); err != nil {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

func unregisterScalar() error {
isScalarAvailable, err := agentutil.IsGitVersionMinimum(thirdparty.RequiredScalarGitVersion)

if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit: I'm okay with it either way, but we could do if err != nil || !isScalarAvailable because errors.Wrap will return nil for err = nil

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put this back based on Annie's feedback here: #8696 (comment)

agent/agent.go Outdated
c := exec.Command("scalar", "unregister", repoPath)
c.Stdout, c.Stderr = os.Stdout, os.Stderr
err = c.Run()
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit: if err = c.Run(); err != nil {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -112,6 +116,8 @@ func validateCloneMethod(method string) error {

func (opts cloneOpts) validate() error {
catcher := grip.NewBasicCatcher()
catcher.NewWhen(opts.useScalar && opts.cloneDepth > 0, "cannot use scalar with clone depth")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you put some comments or make the error message about why we can't? I know we haven't done this before for validation (look at s3.put for reference) but that's why I'm asking. I feel like the context gets lost and I end up looking at this validation and ask 'why can't it be used together' or if a user asks, I have to do a bit of research

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's fair, I added a comment with an explanation and a link to the scalar docs

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought I remember now why I didn't add shallow_clone. It's because shallow_clone sets the clone depth to 100 so at this point there's only depth.

}

// IsGitVersionMinimum checks if the installed Git version is later than the specified version.
func IsGitVersionMinimum(minVersion string) (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper is quite neat but it's doing two things: 1. checking git version 2. checking if it's apple. The func itself doesn't state that and it (looking at the comment below) it looks specific to Scalar. SInce we only use it with Scalar, could you make this func in to IsGitVersionMinimumForScalar() maybe?

Also, could you put a link or docs as to where you found it wasn't bundled with Apple just for comments purposes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to IsGitVersionMinimumForScalar. I didn't find official enough docs on it, I happened to run into the error and figured it out.

@@ -112,6 +116,8 @@ func validateCloneMethod(method string) error {

func (opts cloneOpts) validate() error {
catcher := grip.NewBasicCatcher()
catcher.NewWhen(opts.useScalar && opts.cloneDepth > 0, "cannot use scalar with clone depth")
catcher.NewWhen(opts.useScalar && opts.recurseSubmodules, "cannot use scalar with recurse submodules")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I think this is missing shallowClone and useScalar but I'm not sure. You do it in the new gitFetchProject validate function

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, added it

{"2.37.0", "2.380", true},
{"2.38.0", "2.380", true},
{"2.38.0", "2.38.1", true},
{"2.38.1", "2.38.0", false},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been looking at these tests for a second, but I think something is backwards or unintuitive.

I'm comparing the first to the second, checking if the first meats the minimum version (which is the second). In this case (L100), 2.38.1 is past 2.38.0 by a patch version, but you expect false. It seems like all the test cases have the opposite of what I would expect, could you comment or switch some logic around if I'm interpreting it right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is very true, it was flipped. I'm confused about how this seemed to work while I was testing on staging, maybe I moved things around after testing...? Thank you for catching that!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay so after some more testing and reading, I figured out what happened here. "IsVersionMinimum" checked if it's the minimum or less, and then IsGitVersionMinimum returned !IsVersionMinimum. The naming was not great. I flipped both and made it more clear.


for versionString, expected := range versionStrings {
parsedVersion, isApple, err := ParseGitVersion(versionString)
assert.NoError(t, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: require.NoError

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@ablack12 ablack12 self-requested a review February 13, 2025 16:25
@@ -98,6 +101,7 @@ type cloneOpts struct {
token string
recurseSubmodules bool
useVerbose bool
useScalar bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw your logic for this in the comments and I think it makes sense, but is there any reason not to use it if recurse_submodule and clone_depth aren't set? Should we just use it if those are unset (and the version is doable), and not use it if they are, rather than making it a user parameter? i.e. is there any reason not to use it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial thought was that if we want to make it the standard we probably also want to send out some comms and encourage people to stop using that so they can use scalar, making it slightly more involved and therefore maybe better as future work once it's working well rather than in the initial rollout.

On top of that, while I am not aware of a reason not to use it, I do have some general hesitations around rolling it out across the board. Users have so many different use cases and different creative ways in which they set up and use their projects that I fear that somewhere within this variation there will be someone who hits a use case that is incompatible with it. I was hoping that once some of the bigger and more complicated projects are using it without any issues, we'll be able to roll it out as a default with more confidence.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the first point -- it seems like regardless of whether we make this opt-in or opt-out, we need a follow-up ticket to ensure the improvements are what we expect, and then a step t o send out comms to tell people to opt-in (whether by doing so explicitly, or by reconsidering the things that opt you out).

To the latter, I think the bigger projects are more likely to run into a problem and it might be good to just hit it fast, rather than us saying we support a feature that we aren't comfortable with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if you'd like to discuss this over Zoom -- I mostly want to make sure we're owning this feature change properly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Discussed in DM -- we'll make a follow up ticket to pursue making this the default in the short-term, after some controlled beta testing)

agent/command/git.go Show resolved Hide resolved
if scalarAvailable {
gitCommand = "scalar clone --no-src"
} else {
logger.Task().Infof("cannot use scalar, git version is below %s", thirdparty.RequiredScalarGitVersion)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we do '%s' ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if opts.useScalar {
scalarAvailable, err := agentutil.IsGitVersionMinimumForScalar(thirdparty.RequiredScalarGitVersion)
if err != nil {
return nil, errors.Wrap(err, "checking git version")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we have an issue checking git version, should we just log and fall back to not available?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

"environment": ExecutionEnvironmentType,
"args": flag.Args(),
})
// grip.EmergencyFatal(message.Fields{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be uncommented out?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops yes

return matches[1], isAppleGit, nil
}

// VersionMeetsMinimum checks if the version is less or equal to the minVersion.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could we reword to "returns true if the version is greater or equal to the minVersion" like you say in the test? This comment is a bit misleading for the equal case i think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

agent/agent.go Outdated
@@ -1238,6 +1241,9 @@ func (a *Agent) killProcs(ctx context.Context, tc *taskContext, ignoreTaskGroupC
}
}

// clearGitConfig removes up git files that were created in the home directory including
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: cleans up

@@ -98,6 +101,7 @@ type cloneOpts struct {
token string
recurseSubmodules bool
useVerbose bool
useScalar bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Discussed in DM -- we'll make a follow up ticket to pursue making this the default in the short-term, after some controlled beta testing)

agent/agent.go Outdated
func unregisterScalar() error {
isScalarAvailable, err := agentutil.IsGitVersionMinimumForScalar(thirdparty.RequiredScalarGitVersion)

if err != nil || !isScalarAvailable {
Copy link
Contributor

@ablack12 ablack12 Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if statement doesn't work, since in the case isScalarAvailable is false and err is nil, we're going to return no error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in that case we'll return nil which is accurate because it's not an error state, it just means we don't want to unregister scalar (because we can't and because it was never registered)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay gotcha; I think since this is a typical mistake, could we instead split this into a separate case and add a comment so its clear that it's purposeful?

if err != nil {
logger.Task().Errorf("checking git version failed, falling back to git clone instead of scalar clone: %s.", err)
}
if err != nil && scalarAvailable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean here if err == nil

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also do we want this to be an else if? Otherwise in the err != nil case, we're logging both checking git version failed and cannot use scalar, git version is below ...

@malikchaya2 malikchaya2 requested a review from ablack12 February 14, 2025 17:41
Copy link
Contributor

@ablack12 ablack12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with nit

agent/agent.go Outdated
func unregisterScalar() error {
isScalarAvailable, err := agentutil.IsGitVersionMinimumForScalar(thirdparty.RequiredScalarGitVersion)

if err != nil || !isScalarAvailable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay gotcha; I think since this is a typical mistake, could we instead split this into a separate case and add a comment so its clear that it's purposeful?

@malikchaya2 malikchaya2 merged commit 1cd4391 into evergreen-ci:main Feb 18, 2025
10 checks passed
@malikchaya2 malikchaya2 deleted the DEVPROD-14754 branch February 18, 2025 14:47
bynn pushed a commit to bynn/evergreen that referenced this pull request Feb 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants