Constrain use of hook prePuller #3341

consideRatio · 2023-10-30T00:01:33Z

This is a followup on #3313 (comment) to still retain use of hook prePuller, but to constrain its use.

The use of hook image puller is constrained to:

Not using the jupyterhub-configurator at the same time
This is enforced by the chart schema.
Not using one or more images set in any other way than by singleuser.image
This isn't enforced as its hard to enforce surgically enough to not constrain use of profileList beyond what we want. Due to that, a comment is made next to prePuller.hook.enabled to help us keep track of this. I'm quite strongly opinionated that this comment is retained inline because I think its too hard for us to enforce this policy if its only declared in external docs.

Opt-out of jupyterhub-configurator

With jupyterhub.custom.jupyterhubConfigurator.enabled individual hubs can opt-out of using it. When disabled, it won't present itself to users via the JupyterHub menu under services, and any options set historically when enabled are ignored.

Hub specific changes

Cloudbank

No impact on the communities are expected except for the jupyterhub-configurator no longer being usable on the staging hub.

Hook prePuller was enabled for a few hubs but not all. Since they put users on the the same nodes, and all hubs configure the same singleuser.image, it doesn't help to have more than one hook prePuller as long as it does its work before hubs are upgraded. By enabling hook prePuller on the staging hub, we accomplish this.

With this, the jupyterhub-configurator is disabled for the staging hub though which in this case is an undesired consequence as its really pulling the image for other hubs, possibly using the configurator.

UToronto

The hook prePuller was enabled on staging and prod hubs. Staging hubs benefit from the configurator, and the prod hubs doesn't. As long as hook prePuller is enabled on either staging or prod, they should be fine. This led to:

-	Staging	Prod
hook prePuller	now disabled	remains enabled
jupyterhub-configurator	remains enabled	now disabled

consideRatio · 2023-10-31T11:13:48Z

@sean-morris note that this PR is going to disable the jupyterhub-configurator on staging.cloudbank.2i2c.cloud, but you could test new images in another hub still I've presumed.

yuvipanda · 2023-10-31T13:03:46Z

What would the guidelines be for when one of the cloudbank hubs needs to be on a different image? Or if we're testing a different image on the utoronto staging hub before deploying to prod for an extended time (as we just did with ucmerced)?

yuvipanda · 2023-10-31T13:09:02Z

And wouldn't testing images that can possibly break on a non-staging hub be disruptive to those users, or require specially timing these tests to make sure they are not in use at that time?

config/clusters/2i2c/climatematch.values.yaml

helm-charts/basehub/values.schema.yaml

consideRatio · 2023-10-31T14:55:19Z

What would the guidelines be for when one of the cloudbank hubs needs to be on a different image? Or if we're testing a different image on the utoronto staging hub before deploying to prod for an extended time (as we just did with ucmerced)?

Cloudbank is a multi-community cluster with a shared node pool where one image is made special - the one many defaults to and defined in common.values.yaml. If one of the cloudbank hubs wants to use another image, I think they should do it using the configurator, and they don't have it updated by the hook prePuller as requires it to be set via chart config.

I think hook prePuller shouldn't be enabled for multi-community clusters using shared node pools, but cloudbank is an exception because it has a single image for many hubs making that stand out - its a multi-similar-community cluster using shared node pools.

If jupyterhub-configurator and hook prePuller is considered mandatory on the staging hub, then I think the resolution is to add the additional complexity of adding an override jupyterhub.custom.jupyterhubConfigurator.prePullerCanAlsoBeEnabled or similar.

And wouldn't testing images that can possibly break on a non-staging hub be disruptive to those users, or require specially timing these tests to make sure they are not in use at that time?

Yes it would, but that is the situation independent of using hook prePuller for all shared clusters community hubs right - unless they have the "other" option as well in their hub.

yuvipanda · 2023-10-31T15:09:43Z

Would you be able to describe what the policy for 'should hook pre-puller be enabled for this hub' be? What would someone setting up a new hub do to determine if this should be enabled or not? I think having that be written down and possibly be documented will help me, rather than discussing purely from examples.

GeorgianaElena · 2023-10-31T15:42:11Z

In the context of #3095 I would really like to to move towards empowering communities to use their staging hubs to be able to test images before deploying them to their users and without our intervention every time.

That said, I wouldn't like us to compromise this for cloudbank.

I think having that be written down and possibly be documented will help me, rather than discussing purely from examples.

I agree with @yuvipanda that documenting this somewhere together with this PR will make more sense and will close the circle. Without docs, we just move from an undocumented use of prePuller to an undocumented not use of it if that makes sense.

Because I see this as a "feature", I would document it in the howto of features https://infrastructure.2i2c.org/howto/features/

yuvipanda · 2023-11-03T14:04:15Z

I also just learnt about https://z2jh.jupyter.org/en/stable/resources/reference.html#prepuller-pullprofilelistimages, which seems to indicate that the pre puller also picks up images from profile_list! Is that true? If so, does this mean that the only time the hook prePuller is completely useless when we are using the configurator?

consideRatio · 2023-11-03T14:47:22Z

If so, does this mean that the only time the hook prePuller is completely useless when we are using the configurator?

The hook prePuller can be both helpful and pointless, and both the cause of issues and not, depending on situations. My main concern isn't about it being pointless sometimes, but that its the cause of more issues than its helpful. Besides that, use of prePuller is adding complexity as a guaranteed added negative.

This is a very complicated topic I think. I'm not sure I can find a clear motivation for when to enable it, or clarify the motivation well enough without a significant time investment. @yuvipanda are you okay with letting the scope of this PR to be to just constraining the use of hook prePuller away from some situations, as compared to also defining exactly what situations to use it in?

If there is a week of effort available for me to focus on that, I can get it done with some z2jh docs improvements I think, but I expect the balancing act of specifying exactly when to use it would take a lot of additional work for me.

consideRatio · 2023-11-03T14:50:40Z

I also just learnt about https://z2jh.jupyter.org/en/stable/resources/reference.html#prepuller-pullprofilelistimages, which seems to indicate that the pre puller also picks up images from profile_list! Is that true?

Yes, both variants pull the same set of images, and it are images from the chart config under various sources including singleuser.profileList and the kubespawner_override it defines. This excludes those under profile_options currently, but that is considered by jupyterhub/zero-to-jupyterhub-k8s#3217 that isn't yet merged.

There is also the functionality to not start the hook prePuller unless one image to pull has changed, making the hook prePuller only pull conditionally by default.

yuvipanda · 2023-11-03T15:01:50Z

@yuvipanda are you okay with letting the scope of this PR to be to just constraining the use of hook prePuller away from some situations

I think this PR as is adds a lot of assumptions into our infrastructure - such as staging and production hubs having the same image always, cloudbank not able to use staging to test images, etc. I am not sure what constraining it to be simpler would look like. I am happy to review it again if you are able to constrain it more.

I do have a suggested policy i am happy to share if so desired.

yuvipanda · 2023-11-03T15:02:56Z

If you are able to document why the prePuller should not be used in some specific situation, and constrain the pr to that documentation + implementing that, that is fine too.

yuvipanda · 2023-11-03T15:04:31Z

If there is a week of effort available for me to focus on that, I can get it done with some z2jh docs improvements I think, but I expect the balancing act of specifying exactly when to use it would take a lot of additional work for me.

I definitely don’t think this is a good use of your time right now. Let’s try to find something that works for the 90% use case than burn ourselves out in search for perfection

yuvipanda · 2023-11-03T15:11:31Z

If you think this is too complex to work on right now, i think it is also ok to park it and come back to it later! Please don’t burn yourself out!

It can be useful to let the staging hubs still provide the configurator, so we enable the hook image puller on the prod hubs specifically instead where the configurator is disabled.

consideRatio · 2023-11-13T15:45:28Z

I do have a suggested policy i am happy to share if so desired.

@yuvipanda I'd like to not work this further async due to the complexity of the topic and my struggles communicating async about complex topics. If you want, I'd really appreciate listening to and chatting with you about your policy idea in a meet!

I'll go for a close on this PR for now, it could always be re-opened or stripped for parts later anyhow.

damianavila · 2023-11-15T15:09:35Z

@consideRatio @yuvipanda and @GeorgianaElena, I am happy about how this discussion evolved.
I know it might be a little bit frustrating to close a PR (instead of merging it), but I also think we got a LOT more from the exchanges we had here even more with the additional complexity of an async conversation about a non-trivial topic.
Congrats to all on the exchange and the achieved resolution!

github-actions bot assigned consideRatio Oct 30, 2023

This comment was marked as resolved.

Sign in to view

consideRatio mentioned this pull request Oct 30, 2023

Stop using continuous prePuller #3313

Merged

consideRatio force-pushed the pr/stop-hook-pulling branch from a934593 to 6b995f3 Compare October 31, 2023 10:36

consideRatio marked this pull request as ready for review October 31, 2023 11:07

consideRatio requested a review from a team as a code owner October 31, 2023 11:07

consideRatio requested a review from yuvipanda October 31, 2023 11:13

GeorgianaElena reviewed Oct 31, 2023

View reviewed changes

config/clusters/2i2c/climatematch.values.yaml Outdated Show resolved Hide resolved

helm-charts/basehub/values.schema.yaml Outdated Show resolved Hide resolved

This was referenced Oct 31, 2023

basehub: make use of configurator opt-out #3348

Merged

Come up with when the jupyterhub-configurator is to be disabled #3349

Closed

consideRatio mentioned this pull request Nov 3, 2023

Disable hook prePuller for three communites using the jupyterhub configurator #3371

Merged

consideRatio marked this pull request as draft November 5, 2023 18:35

consideRatio added 3 commits November 13, 2023 11:46

basehub: prohibit use of prePuller.hook with jupyterhub-configurator

7ab68c1

cloudbank: use prePuller.hook on staging hub but no other hub

db11a6f

utoronto: disable hook prePuller and configurator on prod hubs

f1b8517

It can be useful to let the staging hubs still provide the configurator, so we enable the hook image puller on the prod hubs specifically instead where the configurator is disabled.

consideRatio force-pushed the pr/stop-hook-pulling branch from 6b995f3 to f1b8517 Compare November 13, 2023 10:47

consideRatio closed this Nov 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Constrain use of hook prePuller #3341

Constrain use of hook prePuller #3341

consideRatio commented Oct 30, 2023 •

edited

Loading

This comment was marked as resolved.

consideRatio commented Oct 31, 2023

yuvipanda commented Oct 31, 2023

yuvipanda commented Oct 31, 2023

consideRatio commented Oct 31, 2023

yuvipanda commented Oct 31, 2023

GeorgianaElena commented Oct 31, 2023

yuvipanda commented Nov 3, 2023

consideRatio commented Nov 3, 2023

consideRatio commented Nov 3, 2023 •

edited

Loading

yuvipanda commented Nov 3, 2023

yuvipanda commented Nov 3, 2023

yuvipanda commented Nov 3, 2023

yuvipanda commented Nov 3, 2023

consideRatio commented Nov 13, 2023

damianavila commented Nov 15, 2023

Constrain use of hook prePuller #3341

Constrain use of hook prePuller #3341

Conversation

consideRatio commented Oct 30, 2023 • edited Loading

Opt-out of jupyterhub-configurator

Hub specific changes

Cloudbank

UToronto

This comment was marked as resolved.

consideRatio commented Oct 31, 2023

yuvipanda commented Oct 31, 2023

yuvipanda commented Oct 31, 2023

consideRatio commented Oct 31, 2023

yuvipanda commented Oct 31, 2023

GeorgianaElena commented Oct 31, 2023

yuvipanda commented Nov 3, 2023

consideRatio commented Nov 3, 2023

consideRatio commented Nov 3, 2023 • edited Loading

yuvipanda commented Nov 3, 2023

yuvipanda commented Nov 3, 2023

yuvipanda commented Nov 3, 2023

yuvipanda commented Nov 3, 2023

consideRatio commented Nov 13, 2023

damianavila commented Nov 15, 2023

consideRatio commented Oct 30, 2023 •

edited

Loading

consideRatio commented Nov 3, 2023 •

edited

Loading