Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

smt_via_label via machine config #47

Closed

Conversation

schabrolles
Copy link
Contributor

Fixes: #46
Signed-off-by: Sebastien Chabrolles [email protected]

@@ -38,6 +38,10 @@
when: powervm_rmc
import_tasks: powervm_rmc.yaml

- name: Allow SMT configuration via node label (SMT=X)
when: ansible_architecture == "ppc64le"
import_tasks: smt_configuration.yaml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this task optional. I'm not very comfortable fiddling with SMT settings and would like to keep this optional and run through some iterations of e2e tests to verify stability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean adding a variable smt_control (true|false) in the playbook ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and default to false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in my last commit

done
let INITOFFTHREAD=$INITOFFTHREAD+$MAXSMT
done
systemctl restart kubelet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are rebooting the node, we should avoid restarting kubelet

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, is this setting persistent when done in this way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it is persistent as it seems to start before kubelet.
Do you think it is really dangerous to restart kubelet ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how MCO will behave if kubelet is restarted. Additionally since anyway node reboot will happen post application of the machine config then why to restart kubelet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MCO deploy a systemd service which start a script that monitor every 30s the status of the SMT label for the current node.
If this label is set (or changed) the script change the SMT mode on the current node and restart kubelet to update the CPU resource count without rebooting the node (which is REALLY long on powerVM).
So the main idea is to avoid a reboot when changing mode. (and because I’m 90% sure that users will always forget to reboot the node to refresh the kubelet)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is kubelet, there is crio and then there is impact to the containers that are already running. Some scenarios that I can think of and have no idea how will it get handled if kubelet is restarted.

  1. number of logical cpus < total cpu requests for PODs
  2. number of logical cpus < total cpu limits for PODs

I agree reboot will be longer process. However according to me the safest way to set the SMT value during install time (as a oneshot service) via machineconfig and letting MCO take care of the reboot.

Let's think through this and get some datapoints. The base code as part of this PR is good start, it's just finding the right way and agreeing to it.

echo "Target SMT must be smaller or equal than Maximum SMT supported"
fi
fi
/bin/sleep 30
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is not needed ? sleep 30 is here because of the while loop...
script start and loop forever, checking node SMT label every 30s.

@bpradipt
Copy link
Contributor

@schabrolles the commits will need squashing to make it a single commit :-)

@schabrolles
Copy link
Contributor Author

Yes, sure I will do that soon... just finishing testing and adding modification from your comments

@ppc64le-cloud-bot
Copy link
Contributor

ppc64le-cloud-bot commented Jun 20, 2022

@schabrolles: PR is not mergeable.

The PR state is: DIRTY

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

1 similar comment
@Rajalakshmi-Girish
Copy link
Contributor

@schabrolles: PR is not mergeable.

The PR state is: DIRTY

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Rajalakshmi-Girish
Copy link
Contributor

/meow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Changing SMT level based on node label
4 participants