Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(config): Prevent changing bootstrap charm config #319

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

HomayoonAlimohammadi
Copy link
Contributor

@HomayoonAlimohammadi HomayoonAlimohammadi commented Feb 10, 2025

Overview

This PR adds a way to prevent the bootstrap- charm configs from getting changed after the cluster is bootstrapped.

Rationale

Some of the charm config options start with bootstrap- which means that these options should only be specified upon deploying the charm (before the underlying Kubernetes cluster is bootstrapped) and changing them afterwards might result in unexpected behavior. We've discussed that we like to have a mechanism to prevent the user from changing these options.

@HomayoonAlimohammadi HomayoonAlimohammadi requested a review from a team as a code owner February 10, 2025 07:24
@HomayoonAlimohammadi HomayoonAlimohammadi marked this pull request as draft February 10, 2025 07:42
@HomayoonAlimohammadi HomayoonAlimohammadi force-pushed the KU-2306/prevent-changing-bootstrap-config branch from 89cf36e to fd1c0c4 Compare February 10, 2025 12:35
@HomayoonAlimohammadi HomayoonAlimohammadi changed the base branch from main to fix-formatting February 10, 2025 12:35
github-actions[bot]

This comment was marked as resolved.

@HomayoonAlimohammadi HomayoonAlimohammadi marked this pull request as ready for review February 10, 2025 12:49
Base automatically changed from fix-formatting to main February 10, 2025 14:31
@HomayoonAlimohammadi HomayoonAlimohammadi force-pushed the KU-2306/prevent-changing-bootstrap-config branch from b623f9f to 3e07cd0 Compare February 10, 2025 14:35
@HomayoonAlimohammadi HomayoonAlimohammadi marked this pull request as draft February 11, 2025 11:52
@HomayoonAlimohammadi HomayoonAlimohammadi force-pushed the KU-2306/prevent-changing-bootstrap-config branch 3 times, most recently from 8d0af0e to 4483690 Compare February 11, 2025 13:44
@HomayoonAlimohammadi HomayoonAlimohammadi marked this pull request as ready for review February 11, 2025 13:57
@HomayoonAlimohammadi HomayoonAlimohammadi marked this pull request as draft February 12, 2025 12:48
@HomayoonAlimohammadi
Copy link
Contributor Author

Converting this draft as we need to implement the k8sd call for bootstrap config first.

@HomayoonAlimohammadi HomayoonAlimohammadi force-pushed the KU-2306/prevent-changing-bootstrap-config branch from 4483690 to ab1e09a Compare February 21, 2025 13:53
@HomayoonAlimohammadi HomayoonAlimohammadi marked this pull request as ready for review February 21, 2025 14:25
@HomayoonAlimohammadi HomayoonAlimohammadi changed the title charm: Add bootstrap config change preventer feat(config): Prevent changing bootstrap charm config Feb 21, 2025
@@ -5,9 +5,9 @@ amd64:
- name: k8s
install-type: store
classic: true
channel: latest/edge
channel: latest/edge/hue-cluster-config
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for testing purposes, needs to be removed before merge.

Copy link
Contributor

@addyess addyess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, i have a few questions.
I'd also like @mateoflorido to weigh in here too

Comment on lines 89 to 93
default: dqlite
default: k8s-dqlite
type: string
description: |
The datastore to use in Canonical Kubernetes. This cannot be changed
after deployment. Allowed values are "dqlite" and "etcd". If "etcd" is
after deployment. Allowed values are "k8s-dqlite" and "etcd". If "etcd" is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can just change this. charms deployed from 1.32 will deploy with dqlite as their default. Upgrading to this revision will cause an immediate failure.

The default config from 1.32 was dqlite, so that's what we are "stuck with"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Adam. Also, I think this reflect better that we are using dqlite as the datastore. The translation mechanism is something that we can hide internally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I also thought that this might be a breaking change, but here's what I think:
We use the dqlite string in the following places:

@@ -87,15 +87,31 @@ class InvalidResponseError(K8sdAPIManagerError):
code (int): HTTP Status code
"""

def __init__(self, code: int, msg: str) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a LIBPATCH version at the head of this file that if it gets changed, we must also bump

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I always forget this. Lemme create a card for adding a CI check for this.

Comment on lines 1357 to 1359
# NOTE(Hue): The way we're saving certificates has nothing to do with the snap
# so we can't rely on the snap to tell us if the certificates have changed.
self._stored.set_default(certificates=self.config.get(CONFIG_BOOTSTRAP_CERTIFICATES))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we're back to using stored-state?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can infer this from the certificates in the unit:

def check_ca_key():
    return os.path.isFile("/etc/kubernetes/pki/ca.key")

This way we can avoid using stored state

Comment on lines 1334 to 1363
"""Prevent bootstrap config changes after bootstrap."""
log.info("Preventing bootstrap config changes after bootstrap")

try:
cluster_config = bootstrap_config_options_from_cluster_config(
self.api_manager.get_cluster_config().metadata
)
except InvalidResponseError as e:
if e.code == HTTP_503_SERVICE_UNAVAILABLE:
log.info("k8sd is not ready, skipping bootstrap config check")
return
raise

# NOTE(Hue): The way we're saving certificates has nothing to do with the snap
# so we can't rely on the snap to tell us if the certificates have changed.
self._stored.set_default(certificates=self.config.get(CONFIG_BOOTSTRAP_CERTIFICATES))

if self.is_control_plane:
self._prevent_control_plane_bootstrap_config_change(cluster_config)
else:
self._prevent_worker_bootstrap_config_change(cluster_config)

def _prevent_control_plane_bootstrap_config_change(self, ref: BootstrapConfigOptions):
"""Prevent control-plane bootstrap config changes after bootstrap."""
changed: List[ChangedConfig] = []

if self.config.get(CONFIG_BOOTSTRAP_DATASTORE, "") != ref.datastore:
changed.append(
ChangedConfig(
name=CONFIG_BOOTSTRAP_DATASTORE,
cluster_config=ref.datastore,
charm_config=self.config.get(CONFIG_BOOTSTRAP_DATASTORE, ""),
)
)

# NOTE(Hue): bootstrap-certificates is not represented in the cluster config and should be
# checked against the stored value.
if self.config.get(CONFIG_BOOTSTRAP_CERTIFICATES, "") != self._stored.certificates:
changed.append(
ChangedConfig(
name=CONFIG_BOOTSTRAP_CERTIFICATES,
cluster_config=self._stored.certificates,
charm_config=self.config.get(CONFIG_BOOTSTRAP_CERTIFICATES, ""),
)
)

if self.config.get(CONFIG_BOOTSTRAP_NODE_TAINTS, "") != ref.node_taints:
changed.append(
ChangedConfig(
name=CONFIG_BOOTSTRAP_NODE_TAINTS,
cluster_config=ref.node_taints,
charm_config=self.config.get(CONFIG_BOOTSTRAP_NODE_TAINTS, ""),
)
)

if self.config.get(CONFIG_BOOTSTRAP_POD_CIDR, "") != ref.pod_cidr:
changed.append(
ChangedConfig(
name=CONFIG_BOOTSTRAP_POD_CIDR,
cluster_config=ref.pod_cidr,
charm_config=self.config.get(CONFIG_BOOTSTRAP_POD_CIDR, ""),
)
)

if self.config.get(CONFIG_BOOTSTRAP_SERVICE_CIDR, "") != ref.service_cidr:
changed.append(
ChangedConfig(
name=CONFIG_BOOTSTRAP_SERVICE_CIDR,
cluster_config=ref.service_cidr,
charm_config=self.config.get(CONFIG_BOOTSTRAP_SERVICE_CIDR, ""),
)
)

if changed:
for c in changed:
log.error(
"Bootstrap config '%s' should NOT be changed after bootstrap. %s", c.name, c
)
raise BootstrapConfigChangeError(changed)

def _prevent_worker_bootstrap_config_change(self, ref: BootstrapConfigOptions):
"""Prevent worker bootstrap config changes after bootstrap."""
changed: List[ChangedConfig] = []

if self.config.get(CONFIG_BOOTSTRAP_NODE_TAINTS, "") != ref.node_taints:
changed.append(
ChangedConfig(
name=CONFIG_BOOTSTRAP_NODE_TAINTS,
cluster_config=ref.node_taints,
charm_config=self.config.get(CONFIG_BOOTSTRAP_NODE_TAINTS, ""),
)
)

if changed:
for c in changed:
log.error(
"Bootstrap config '%s' should NOT be changed after bootstrap. %s", c.name, c
)
raise BootstrapConfigChangeError(changed)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could i talk you into moving all of this out of charm.py. Maybe bootstrap.py?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, charm.py is becoming a behemoth

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course you can :D

Comment on lines 89 to 93
default: dqlite
default: k8s-dqlite
type: string
description: |
The datastore to use in Canonical Kubernetes. This cannot be changed
after deployment. Allowed values are "dqlite" and "etcd". If "etcd" is
after deployment. Allowed values are "k8s-dqlite" and "etcd". If "etcd" is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Adam. Also, I think this reflect better that we are using dqlite as the datastore. The translation mechanism is something that we can hide internally.


status: UserFacingClusterConfig
datastore: Optional[UserFacingDatastoreConfig] = Field(default=None)
node_taints: Optional[List[str]] = Field(default=None, alias="nodeTaints")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This alias looks odd... Are we going to change the naming schema?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this in k8s-snap-api, k8s-snap and also here.

@@ -942,6 +990,8 @@ def create_connection(self):
class K8sdAPIManager:
"""Manager for K8sd API interactions."""

CLUSTER_CONFIG_ENDPOINT: str = "/1.0/k8sd/cluster/config"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we can keep it inline or create another PR to refactor all the other endpoints. It would be a good opportunity to improve the library, but I think it would make the PR harder to review.

Copy link
Contributor Author

@HomayoonAlimohammadi HomayoonAlimohammadi Feb 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove this and create a card for the refactor.

Comment on lines 1357 to 1359
# NOTE(Hue): The way we're saving certificates has nothing to do with the snap
# so we can't rely on the snap to tell us if the certificates have changed.
self._stored.set_default(certificates=self.config.get(CONFIG_BOOTSTRAP_CERTIFICATES))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can infer this from the certificates in the unit:

def check_ca_key():
    return os.path.isFile("/etc/kubernetes/pki/ca.key")

This way we can avoid using stored state

@HomayoonAlimohammadi HomayoonAlimohammadi force-pushed the KU-2306/prevent-changing-bootstrap-config branch from ab1e09a to 77bfcb3 Compare February 24, 2025 09:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants