-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add arguments to pass Ray cluster head and worker templates #570
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one comment, which pertains to the scope of what the user can provide. I think it might require more in depth discussion.
|
||
def apply_worker_template(cluster_yaml: dict, worker_template: client.V1PodTemplateSpec): | ||
worker = cluster_yaml.get("spec").get("workerGroupSpecs")[0] | ||
merge(worker["template"], worker_template.to_dict(), strategy=Strategy.ADDITIVE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How should a user edit the container spec? Specifying partial container spec will only add a new partial entry to the containers
list. This becomes an issue when specifying something like extra volume mounts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, ideally we'd want to leverage strategic merge patch for that: https://kubernetes.io/docs/tasks/manage-kubernetes-objects/update-api-object-kubectl-patch/#notes-on-the-strategic-merge-patch. It's easy to do in Go by using https://github.com/kubernetes/apimachinery/blob/master/pkg/util/strategicpatch/patch.go, but it don't know if that's possible in Python.
With mergedeep
and the ADDITIVE
strategy, lists are not replaced, but elements appended. The problem being that something like:
cluster = Cluster(ClusterConfiguration(
head_template=V1PodTemplateSpec(
spec=V1PodSpec(
containers=[
V1Container(name="ray-head", volume_mounts=[
V1VolumeMount(name="config", mount_path="path"),
])
],
volumes=[
V1Volume(name="config", config_map=V1ConfigMapVolumeSource(name="config")),
],
)
),
It appends an entire new container, while it should only append the extra volume mount to the existing container.
Strategic merge patch solves this, by relying on information like patchMergeKey
in Go structs (x-kubernetes-patch-merge-key
in CRDs) so it knows how to match items by key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't seem to find anything for merging V1PodTemplate specs, only namespaced pods by sending API requests. There's this package https://pypi.org/project/jsonmerge/ which we could use, it would require maintaining a bit of redundant config however 😒.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I've actually stumbled upon jsonmerge
this morning as well. it seems it'd be possible to provide a merge strategy by key for lists. Let's give it a try? I agree we'd have to maintain some config, unless we figure a way to leverage Kubernetes JSON schema https://github.com/yannh/kubernetes-json-schema?tab=readme-ov-file, that do contain x-kubernetes-patch-merge-key
information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@varshaprasad96 maybe you would have some ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One step better than just additive merging could be calculating the diff and then merging them (deepdiff and then deepmerge - with probably custom merging strategies if needed). But this would still not solve the problem with conflicts at the very least. Looks like if we want to leverage JSON schema we either need to use a live client or load the config to guide merging process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a script that we could use as a starting point for generate a jsonmerge schema from a k8 schema. I haven't gotten around to testing it yet. The output looks right based on my understanding of the jsonmerge
package documentation, not based on testing done with actual objects. Sorry for the messiness
def merge_schema_from_k8_schema(k8_schema):
to_return = {}
k8_properties = {}
if "array" in k8_schema.get("type", []) and k8_schema.get("x-kubernetes-list-type") != "atomic":
to_return["items"] = {}
to_return["items"]["properties"] = {}
toret_properties = to_return["items"]["properties"]
k8_properties = k8_schema.get("items", {}).get("properties", {})
elif "object" in k8_schema.get("type", []):
to_return["properties"] = {}
toret_properties = to_return["properties"]
k8_properties = k8_schema.get("properties", {})
for key, value in k8_properties.items():
if "object" in value.get("type", []) and "properties" in value:
toret_properties[key] = merge_schema_from_k8_schema(value)
elif "array" in value.get("type", []) and "items" in value:
toret_properties[key] = merge_schema_from_k8_schema(value)
else:
toret_properties[key] = {"mergeStrategy": "overwrite"}
if "array" in k8_schema.get("type", []):
if k8_schema.get("x-kubernetes-list-type") == "set":
to_return["mergeStrategy"] = "arrayMergeById"
to_return["idRef"] = "/"
elif k8_schema.get("x-kubernetes-list-type") == "atomic":
to_return["mergeStrategy"] = "overwrite"
elif k8_schema.get("x-kubernetes-list-type") == "map":
if k8_schema.get("x-kubernetes-patch-merge-key"):
to_return["mergeStrategy"] = "arrayMergeById"
to_return["id"] = k8_schema["x-kubernetes-patch-merge-key"]
else:
to_return["mergeStrategy"] = "overwrite"
elif "object" in k8_schema.get("type", []):
to_return["mergeStrategy"] = "objectMerge"
else:
to_return["mergeStrategy"] = "overwrite"
return to_return
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Issue link
Fixes #413.
What changes have been made
This PR adds parameters to the Ray cluster configuration API, so users can provide Pod template for the head and worker nodes, e.g.:
Verification steps
Checks