Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Advanced copy API needs PRN support #3853

Closed
dralley opened this issue Jan 9, 2025 · 6 comments · Fixed by #3864
Closed

Advanced copy API needs PRN support #3853

dralley opened this issue Jan 9, 2025 · 6 comments · Fixed by #3864
Assignees
Labels
Milestone

Comments

@dralley
Copy link
Contributor

dralley commented Jan 9, 2025

As part of the effort to move Pulp APIs over to "PRNs" for various reasons (see pulp/pulpcore#5766), we need to update the advanced copy API, which unlike the other APIs didn't inherit support from pulpcore because it's custom.

The two places that definitely need updates are:

https://github.com/pulp/pulp_rpm/blob/main/pulp_rpm/app/viewsets/repository.py#L764C1-L765C55
https://github.com/pulp/pulp_rpm/blob/main/pulp_rpm/app/serializers/repository.py#L582-L583

@dralley
Copy link
Contributor Author

dralley commented Jan 16, 2025

@dralley dralley added this to the 3.28 milestone Jan 16, 2025
@dralley dralley added Feature and removed Task labels Jan 16, 2025
@pedro-psb pedro-psb self-assigned this Jan 16, 2025
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Jan 23, 2025
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Jan 23, 2025
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Jan 23, 2025
@pedro-psb
Copy link
Member

@dralley @ggainey On the copy task we do filter for the domain, but I assume we want a nice error message if there any content is not part of domain.

I'm also assuming doing an extra count query is our cheaper option:

  1. compare the number of units provided by the user with the number of those content in the current domain (the extra count query), which should be equal.
  2. If validation fails, make a new query to display what content does not belong to the domain before raising.

Wdyt?

@dralley
Copy link
Contributor Author

dralley commented Jan 24, 2025

Nice error messages are a bonus but not necessarily a requirement - nobody is complaining about the current state of things after all, and that would be an issue with the current code too yeah?

Of course, it would be nice :)

There's a few different ways you could structure it and I'm not sure which is cheapest. e.g. options might be

content.filter(pk__in=list_of_pks_from_prn).exclude(pulp_domain=domain).count()

or

content.filter(pk__in=list_of_pks_from_prn).values("pulp_domain").distinct()

Sidenote: how much are we just "assuming" that these are too expensive to do in the viewset?

@pedro-psb
Copy link
Member

and that would be an issue with the current code too yeah?

True. The current code does identify the wrong domain href, but only the first, so it's not really too different in terms of usefulness for the user. But on the task context we can afford the better message for failures.

Sidenote: how much are we just "assuming" that these are too expensive to do in the viewset?

I'm 100% assuming. Do you have some estimates for safe numbers we could test? I can sync up a big repo on-demand and do a request with lots of content.

@ggainey
Copy link
Contributor

ggainey commented Jan 24, 2025

I'm 100% assuming. Do you have some estimates for safe numbers we could test? I can sync up a big repo on-demand and do a request with lots of content.

I have a (very vague) memory of running into timeout issues in the view when testing with a Large Number of HREFs - but it would have been literally years ago, I may be misremembering. The katello usecase does result in some Very Large lists - but it's def worth getting some actual results.

Being able to make the decision with a single call makes a huge difference - in my head, it was going to cost one db-access per PRN, which would be Bad. Limiting the fields-being-returned from the queries suggested by @dralley will help some as well, both performance but especially memory.

@pedro-psb pedro-psb linked a pull request Jan 24, 2025 that will close this issue
@ianballou
Copy link
Contributor

Katello implemented "chunked copying" with our use of the advanced copy API, and that is capped at 10,000 hrefs.

We only use the advanced copy API with dependency solving. Otherwise we use the normal add and remove content APIs.

pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Jan 28, 2025
* Add cross_domain validation for PRNs on copy API serializer. The new
  validation doesn't capture the first href/prn that failed anymore,
  as it relies on querying distinct domains in the bulk of references.

Fixes: pulp#3853
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Jan 28, 2025
* Add cross_domain validation for PRNs on copy API serializer. The new
  validation doesn't capture the first href/prn that failed anymore,
  as it relies on querying distinct domains in the bulk of references.

Fixes: pulp#3853
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Jan 29, 2025
* Add cross_domain validation for PRNs on copy API serializer. The new
  validation doesn't capture the first href/prn that failed anymore,
  as it relies on querying distinct domains in the bulk of references.

Fixes: pulp#3853
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Jan 29, 2025
* Add cross_domain validation for PRNs on copy API serializer. The new
  validation doesn't capture the first href/prn that failed anymore,
  as it relies on querying distinct domains in the bulk of references.

Fixes: pulp#3853
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Jan 30, 2025
* Add cross_domain validation for PRNs on copy API serializer. The new
  validation doesn't capture the first href/prn that failed anymore,
  as it relies on querying distinct domains in the bulk of references.

Fixes: pulp#3853
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Jan 31, 2025
* Add cross_domain validation for PRNs on copy API serializer. The new
  validation doesn't capture the first href/prn that failed anymore,
  as it relies on querying distinct domains in the bulk of references.

Fixes: pulp#3853
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Jan 31, 2025
* Add cross_domain validation for PRNs on copy API serializer. The new
  validation doesn't capture the first href/prn that failed anymore,
  as it relies on querying distinct domains in the bulk of references.

Fixes: pulp#3853
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Jan 31, 2025
* Add cross_domain validation for PRNs on copy API serializer. The new
  validation doesn't capture the first href/prn that failed anymore,
  as it relies on querying distinct domains in the bulk of references.

Fixes: pulp#3853
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Jan 31, 2025
* Add cross_domain validation for PRNs on copy API serializer. The new
  validation doesn't capture the first href/prn that failed anymore,
  as it relies on querying distinct domains in the bulk of references.

Fixes: pulp#3853
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue Feb 4, 2025
* Add cross_domain validation for PRNs on copy API serializer. The new
  validation doesn't capture the first href/prn that failed anymore,
  as it relies on querying distinct domains in the bulk of references.

Fixes: pulp#3853

fixup: review changes

fixup: review changes on class type validation
dralley pushed a commit that referenced this issue Feb 5, 2025
* Add cross_domain validation for PRNs on copy API serializer. The new
  validation doesn't capture the first href/prn that failed anymore,
  as it relies on querying distinct domains in the bulk of references.

Fixes: #3853

fixup: review changes

fixup: review changes on class type validation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants