Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented Artifactory upload and download handling #30

Merged
merged 2 commits into from
Dec 13, 2024

Conversation

eeasaa01
Copy link
Contributor

@eeasaa01 eeasaa01 commented Nov 5, 2024

Implemented Artifactory Cache Integration

Changes

  • Added GRABIT_ARTIFACTORY_TOKEN support
  • Implemented cache download priority
  • Added fallback to source URLs
  • Added integrity validation
  • Added unit tests for cache functionality
  • Added integration tests
  • Fixed logging levels
  • Added proper error handling

Testing Done

  • Added TestGrabitRequirements for comprehensive testing
  • Tested token validation
  • Tested cache download and fallback
  • Tested NO_CACHE_UPLOAD
  • Tested error handling
  • All tests passing

Changes Made Based on Feedback

  • Fixed authentication error handling
  • Adjusted log levels for cache operations
  • Improved error messages
  • Added proper validation at each step
  • Added comprehensive test coverage

Checklist

  • I have read the contributing guidelines
  • Functionality is documented
  • All code style checks pass
  • New code contribution is covered by automated tests
  • All new and existing tests pass

Copy link
Contributor

@jsquyres jsquyres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There appears to be quite a bit of functionality and many changes that are not directly related to the Artifactory / caching functionality, which makes this PR a bit hard to read and review.

Also, can we get the writeup of what we decided would be the first cut of this functionality? Without that, @rabadin can't properly evaluate this code.

I assume that you notice that the CI tests are failing, too. You'll want to make test before you push again and make sure everything passes.

Finally, let's talk in the next meeting about how to add some unit tests for this functionality.

@jsquyres
Copy link
Contributor

jsquyres commented Nov 7, 2024

Note that this PR now has a (simple) conflict that has to be fixed. Github won't run the CI on a PR that has conflicts, which is why there's currently no CI results here.

@eeasaa01 eeasaa01 force-pushed the feature/artifactory-upload branch from 082c98b to c389fb2 Compare November 12, 2024 20:38
Copy link
Contributor

@jsquyres jsquyres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to remove the a640.... binary.

FWIW: Using git add -a or git commit -a is dangerous. It is much safer to git add ...list of files... or git add -p to select precisely which things you want to add to the staging area.

internal/lock.go Outdated
}
defer file.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it yes. A small one since this would be GCed but still.

@eeasaa01 eeasaa01 marked this pull request as draft November 13, 2024 13:34
@eeasaa01 eeasaa01 marked this pull request as ready for review November 13, 2024 16:46
Copy link
Contributor

@rabadin rabadin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks generally okay... but I really don't understand why you're adding so much changes that are completely unrelated to the task at hand and that, in most cases, are not real improvements. See my comments inline.

Also: before this change the coverage is 81.8& and after it is 80.4%... could you improve the coverage of the newly-added code to ensure that the coverage doesn't go down with this change?

@eeasaa01 eeasaa01 marked this pull request as draft November 18, 2024 12:31
@eeasaa01 eeasaa01 marked this pull request as ready for review November 18, 2024 21:00
@jsquyres
Copy link
Contributor

It looks like there were some bad merges that were not fixed properly before committing. For example, I see a full Open MPI tarball added in 4033f12. Word to the wise: you should avoid using git add -a or git commit -a. It is far to easy to commit files that you do not intend.

Additionally, make test fails to compile the tests (it looks like there's some git conflict markers that were left in the source files). This is one reason that the CI is failing. In general, you should probably ensure that grabit builds properly and the tests are passing before pushing up to github.

internal/lock.go Outdated
return err
}
tmpPath := tmpFile.Name()
defer os.Remove(tmpPath)
Copy link
Contributor

@rabadin rabadin Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're doing a remove... and on line 270 you're doing a rename... why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have resolved this without giving me a good explanation... re-opening...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was to trying to make sure the temporary file is deleted if something goes wrong. If everything works fine, the file get renamed and moved to where it needs to be.
Is there a better way to do this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes: just remove all this code and re-use the code that exists; you only need to add support for a bearer token:

diff --git a/internal/resource.go b/internal/resource.go
index a377e87..aab11b0 100644
--- a/internal/resource.go
+++ b/internal/resource.go
@@ -45,17 +45,20 @@ func NewResourceFromUrl(urls []string, algo string, tags []string, filename stri
 }

 // getUrl downloads the given resource and returns the path to it.
-func getUrl(u string, fileName string, ctx context.Context) (string, error) {
+func getUrl(u string, fileName string, bearer string, ctx context.Context) (string, error) {
 	_, err := url.Parse(u)
 	if err != nil {
 		return "", fmt.Errorf("invalid url '%s': %s", u, err)
 	}
 	log.Debug().Str("URL", u).Msg("Downloading")
-	err = requests.
+	req := requests.
 		URL(u).
 		Header("Accept", "*/*").
-		ToFile(fileName).
-		Fetch(ctx)
+		ToFile(fileName)
+	if bearer != "" {
+		req.Header("Authorization", fmt.Sprintf("Bearer %s", bearer))
+	}
+	err = req.Fetch(ctx)
 	if err != nil {
 		return "", fmt.Errorf("failed to download '%s': %s", u, err)
 	}
@@ -64,12 +67,12 @@ func getUrl(u string, fileName string, ctx context.Context) (string, error) {
 }

 // GetUrlToDir downloads the given resource to the given directory and returns the path to it.
-func GetUrlToDir(u string, targetDir string, ctx context.Context) (string, error) {
+func GetUrlToDir(u string, targetDir string, bearer string, ctx context.Context) (string, error) {
 	// create temporary name in the target directory.
 	h := sha256.New()
 	h.Write([]byte(u))
 	fileName := filepath.Join(targetDir, fmt.Sprintf(".%s", hex.EncodeToString(h.Sum(nil))))
-	return getUrl(u, fileName, ctx)
+	return getUrl(u, fileName, bearer, ctx)
 }

 // GetUrlWithDir downloads the given resource to a temporary file and returns the path to it.
@@ -79,7 +82,7 @@ func GetUrltoTempFile(u string, ctx context.Context) (string, error) {
 		log.Fatal().Err(err)
 	}
 	fileName := file.Name()
-	return getUrl(u, fileName, ctx)
+	return getUrl(u, fileName, "", ctx)
 }

 func (l *Resource) Download(dir string, mode os.FileMode, ctx context.Context) error {
@@ -92,7 +95,7 @@ func (l *Resource) Download(dir string, mode os.FileMode, ctx context.Context) e
 	for _, u := range l.Urls {
 		// Download file in the target directory so that the call to
 		// os.Rename is atomic.
-		lpath, err := GetUrlToDir(u, dir, ctx)
+		lpath, err := GetUrlToDir(u, dir, "", ctx)
 		if err != nil {
 			downloadError = err
 			continue

@rabadin
Copy link
Contributor

rabadin commented Nov 27, 2024

@amdurh05 Also, you have a random 11M binary file in here.

@rabadin
Copy link
Contributor

rabadin commented Dec 2, 2024

@eeasaa01 Btw, you have a conflict that you need to fix by rebasing your work onto the latest main.

@rabadin
Copy link
Contributor

rabadin commented Dec 2, 2024

@eeasaa01 I suggested doing a rebase. You did a merge. That's going to potentially complicate the process when you squash all your changes into just one commit.

@eeasaa01
Copy link
Contributor Author

eeasaa01 commented Dec 2, 2024 via email

@jsquyres
Copy link
Contributor

jsquyres commented Dec 2, 2024

I did the merge through the GitHub website, as it was the only option I saw there—apologies for that.

In the future: although GitHub does have the option for Rebasing (instead of Merging), you might be better served doing the operation on the command line or in your local IDE -- it might be slightly less confusing as to what is actually happening.

As we talked about in our status meetings, merging the target branch into your PR branch frequently leads to pain in the long run. It should be avoided whenever possible. Rebase will generally spare you that pain.

Also, I now see the tests failing in two files: lock_test and download_test. What do you recommend to fix these?

Your PR removed the 5th argument from lock.Download().

https://github.com/cisco-open/grabit/pull/30/files#diff-4d8cb8fc3d173e6b0cd63fafbe1a7ee6641bddcea1fc7e13d6c07b00d1d684c8L92-R184

  • If you didn't mean to do that, put that 5th argument back.
  • If you did mean to do that, then everywhere that calls lock.Download() needs to be adjusted to remove that 5th argument.

@jsquyres jsquyres force-pushed the feature/artifactory-upload branch from 672863d to c7aaa93 Compare December 3, 2024 00:52
@jsquyres
Copy link
Contributor

jsquyres commented Dec 3, 2024

@eeasaa01 and I are DM'ing in Teams to try to clean this up. I just took a first step here, and did the following:

  1. Removed the problematic merge (and all the problematic commits that came after it) from earlier today
  2. Squashed all the commits from after 3396a6f into a single commit

We still have a problematic merge in the branch on this commit. The merge from eeasaa01#1 contains a problematic merge from main as well, which is still causing more conflicts. Still need to resolve this...

@eeasaa01 eeasaa01 force-pushed the feature/artifactory-upload branch 7 times, most recently from 3520e1d to 971b190 Compare December 4, 2024 16:41
@@ -19,6 +19,8 @@ jobs:
go-version-file: 'go.mod'

- name: Test
env:
GRABIT_ARTIFACTORY_TOKEN: team2grabit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's a good idea. You want to control this in code via https://pkg.go.dev/testing#B.Setenv. This way you can also test what happens when it's not set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this way it may work, but okay, I will use your suggestion right away.

@eeasaa01 eeasaa01 force-pushed the feature/artifactory-upload branch 4 times, most recently from 9566766 to 2123068 Compare December 4, 2024 22:16
internal/lock.go Outdated

err := deleteCache(artifactoryURL, token)
if err != nil {
log.Warn().Msg("Warning: Unable to delete from Artifcatory")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: Artifcatory

internal/lock.go Outdated
@@ -89,7 +176,7 @@ func strToFileMode(perm string) (os.FileMode, error) {

// Download gets all the resources in this lock file and moves them to
// the destination directory.
func (l *Lock) Download(dir string, tags []string, notags []string, perm string, status bool) error {
func (l *Lock) Download(dir string, tags []string, notags []string, perm string, someBool bool) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this change, why the name change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn’t really change it. When I was testing, the function needed five arguments, so I added a temporary one called someBool. Later, when I added Dylan’s code, I saw it should be status, so I fixed it.

internal/lock.go Outdated
return err
}
tmpPath := tmpFile.Name()
defer os.Remove(tmpPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have resolved this without giving me a good explanation... re-opening...

internal/lock.go Outdated
return err
}
tmpPath := tmpFile.Name()
defer os.Remove(tmpPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes: just remove all this code and re-use the code that exists; you only need to add support for a bearer token:

diff --git a/internal/resource.go b/internal/resource.go
index a377e87..aab11b0 100644
--- a/internal/resource.go
+++ b/internal/resource.go
@@ -45,17 +45,20 @@ func NewResourceFromUrl(urls []string, algo string, tags []string, filename stri
 }

 // getUrl downloads the given resource and returns the path to it.
-func getUrl(u string, fileName string, ctx context.Context) (string, error) {
+func getUrl(u string, fileName string, bearer string, ctx context.Context) (string, error) {
 	_, err := url.Parse(u)
 	if err != nil {
 		return "", fmt.Errorf("invalid url '%s': %s", u, err)
 	}
 	log.Debug().Str("URL", u).Msg("Downloading")
-	err = requests.
+	req := requests.
 		URL(u).
 		Header("Accept", "*/*").
-		ToFile(fileName).
-		Fetch(ctx)
+		ToFile(fileName)
+	if bearer != "" {
+		req.Header("Authorization", fmt.Sprintf("Bearer %s", bearer))
+	}
+	err = req.Fetch(ctx)
 	if err != nil {
 		return "", fmt.Errorf("failed to download '%s': %s", u, err)
 	}
@@ -64,12 +67,12 @@ func getUrl(u string, fileName string, ctx context.Context) (string, error) {
 }

 // GetUrlToDir downloads the given resource to the given directory and returns the path to it.
-func GetUrlToDir(u string, targetDir string, ctx context.Context) (string, error) {
+func GetUrlToDir(u string, targetDir string, bearer string, ctx context.Context) (string, error) {
 	// create temporary name in the target directory.
 	h := sha256.New()
 	h.Write([]byte(u))
 	fileName := filepath.Join(targetDir, fmt.Sprintf(".%s", hex.EncodeToString(h.Sum(nil))))
-	return getUrl(u, fileName, ctx)
+	return getUrl(u, fileName, bearer, ctx)
 }

 // GetUrlWithDir downloads the given resource to a temporary file and returns the path to it.
@@ -79,7 +82,7 @@ func GetUrltoTempFile(u string, ctx context.Context) (string, error) {
 		log.Fatal().Err(err)
 	}
 	fileName := file.Name()
-	return getUrl(u, fileName, ctx)
+	return getUrl(u, fileName, "", ctx)
 }

 func (l *Resource) Download(dir string, mode os.FileMode, ctx context.Context) error {
@@ -92,7 +95,7 @@ func (l *Resource) Download(dir string, mode os.FileMode, ctx context.Context) e
 	for _, u := range l.Urls {
 		// Download file in the target directory so that the call to
 		// os.Rename is atomic.
-		lpath, err := GetUrlToDir(u, dir, ctx)
+		lpath, err := GetUrlToDir(u, dir, "", ctx)
 		if err != nil {
 			downloadError = err
 			continue

// TestDownloadWithArtifactoryCache verifies downloading resources with caching enabled.
func TestDownloadWithArtifactoryCache(t *testing.T) {
// Sub-test to verify behavior when NO_CACHE_UPLOAD is set.
t.Run("NO_CACHE_UPLOAD", func(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this: there is no need for NO_CACHE_UPLOAD. So this test can go.

// TestAddWithArtifactoryCache verifies adding a resource with caching enabled.
func TestAddWithArtifactoryCache(t *testing.T) {
// Sub-test to verify behavior when the token is not set.
t.Run("TokenNotSet", func(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the benefit of using t.Run if you only have one t.Run per test?

// Clear the GRABIT_ARTIFACTORY_TOKEN environment variable.
t.Setenv("GRABIT_ARTIFACTORY_TOKEN", "")

// Setup a simple HTTP handler that always returns "test content".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is all this setup really necessary if you're only testing the case where GRABIT_ARTIFACTORY_TOKEN isn't set?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rabadin Looks like without defining "t.Setenv("GRABIT_ARTIFACTORY_TOKEN", "")" the test fails when I run "make test" but passes when I run the test directly. Do you have any recommendations for this?

Went ahead and pushed this change to see if it passes the CI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean by "when I run the test directly". Do you mean running this individual test in isolation or manually testing things?
I see that the tests pass right but you're not testing all the functionality, which is reflected in the new coverage number (with this PR): 75.8%.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rabadin I mean when I run the test in isolation.

I think one way to up the testing coverage could be testing the normal Artifactory download functionality, I don't think this is currently being tested. If you think this would be a good test to add I can work on adding it when I get home this evening.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By definition the UT coverage is the coverage of the UT.

Copy link
Contributor

@rabadin rabadin Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, yes, you clearly need to test:
a) the happy path

  • Adding a resources with a cache
  • Downloading a resource with a cache, getting the data from the cache (using mocking, as discussed)
  • Deleting a resource with a cache, deleting the cache (using mocking, as discussed)

b) the non happy path

  • Adding resource with a cache but no token env var
  • Downloading a resource with a cache, but the cache GET errors
  • Deleting a resource with a cache, and deleting the cache works

Copy link
Contributor

@jsquyres jsquyres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like the progress bar code was re-integrated properly. When I run:

$ ./grabit download --status
✔[████████████████████]          3/3 Resources          126 MB / 126 MB          13s elapsed 

I see an empty status bar for 13 seconds (and 0/3 Resources 0 B / 126 MB) and then all the sudden it's full / done. Something is not right -- it should be continually updating with its actual status.

@MDylanWilliams
Copy link
Contributor

MDylanWilliams commented Dec 5, 2024

It doesn't look like the progress bar code was re-integrated properly. When I run:

$ ./grabit download --status
✔[████████████████████]          3/3 Resources          126 MB / 126 MB          13s elapsed 

I see an empty status bar for 13 seconds (and 0/3 Resources 0 B / 126 MB) and then all the sudden it's full / done. Something is not right -- it should be continually updating with its actual status.

The SL seems to be working with normal HTTP resources. I do not have Artifactory set up on my machine -- does this only happen when downloading from Artifactory?

The SL should be updating properly if Increment is called after every successful download.

@jsquyres
Copy link
Contributor

jsquyres commented Dec 5, 2024

@MDylanWilliams I think this was a bad test on my part. With more proper testing on my side, I think you're right -- it is working properly. Sorry for the confusion!

@jsquyres jsquyres dismissed their stale review December 5, 2024 16:54

I tested incorrectly -- the status bar does appear to be working properly.

@eeasaa01 eeasaa01 force-pushed the feature/artifactory-upload branch 4 times, most recently from 72ff3bd to 811e0f4 Compare December 7, 2024 08:54
- Added GRABIT_ARTIFACTORY_TOKEN support
- Implemented cache download priority
- Added fallback to source URL
- Added validation at each step
- Improved error handling and logging
- Integrated cache upload functionality

Co-authored-by: Amin Assaid <[email protected]>
Co-authored-by: Amy Druham <[email protected]>
@rabadin rabadin force-pushed the feature/artifactory-upload branch from 811e0f4 to ae77c89 Compare December 11, 2024 13:39
- Rename flag and internal arguments to clarify that this caching is specific to artifactory
- Refactor tests to cover both the happy path and critical error paths
- Move code and test to resource*.go
- Add integrity checking for cached artifacts
@rabadin rabadin merged commit dfd0d3f into cisco-open:main Dec 13, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants