Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor ReconcileNetwork #2040

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dharaneeshvrd
Copy link
Contributor

@dharaneeshvrd dharaneeshvrd commented Nov 7, 2024

What this PR does / why we need it:

  • If only IBMPowerVSCluster.Spec.Network is set, network would be validated and if exists already will get used as cluster’s network or a new network will be created via DHCP service.
  • If only IBMPowerVSCluster.Spec.DHCPServer is set, DHCP server would be validated and if exists already, will use DHCP server’s network as cluster network. If not a new DHCP service will be created and it’s network will be used.
  • If both IBMPowerVSCluster.Spec.Network & IBMPowerVSCluster.Spec.DHCPServer is set, network and DHCP server would be validated and if both exists already then network is belongs to given DHCP server or not would be validated.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1933

Special notes for your reviewer:

/area provider/ibmcloud

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Release note:

Refactor ReconcileNetwork

@k8s-ci-robot k8s-ci-robot added the area/provider/ibmcloud Issues or PRs related to ibmcloud provider label Nov 7, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dharaneeshvrd
Once this PR has been reviewed and has the lgtm label, please assign prajyot-parab for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 7, 2024
Copy link

netlify bot commented Nov 7, 2024

Deploy Preview for kubernetes-sigs-cluster-api-ibmcloud ready!

Name Link
🔨 Latest commit 205a6e2
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-cluster-api-ibmcloud/deploys/67471c45a7a4910008ff5e0c
😎 Deploy Preview https://deploy-preview-2040--kubernetes-sigs-cluster-api-ibmcloud.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@dharaneeshvrd
Copy link
Contributor Author

/cc @Amulyam24 @Karthik-K-N

@dharaneeshvrd dharaneeshvrd force-pushed the validate-existing-dhcp branch 2 times, most recently from e39f7b8 to bbef168 Compare November 7, 2024 07:14
}
return network.NetworkID, nil

if nw != nil && !nw.DhcpManaged {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think nw.DhcpManaged is required? We are getting network id from status and we set network id to status only when we create DHCP server.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I can remove this check.

if err != nil {
return nil, err
}
if err := s.verifyDHCPServerRespectToNetwork(*networkID); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this forces the user to set the dhcp network only. so we should update the api spec to mention it.
Also never tried what happens if we pass public network id here, Not relaying on DHCP.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I think we should use DHCP server only, during boot, CoreOS expects the ip from a DHCP server. I hope the case is same for vanilla capi cluster as well.
Shall I modify the API spec as part of this PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use Centos images in normal capi flow as well.
@mkumatag whats your thought. Should we make this restriction or do you foresee any other use cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope, we should allow both dhcp based and regular public networks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, current code designed to handle only DHCP network. @Karthik-K-N We can create a story for this to handle later I believe. Wdyt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, That can be done.

@dharaneeshvrd dharaneeshvrd force-pushed the validate-existing-dhcp branch 3 times, most recently from fb3990c to fac8446 Compare November 8, 2024 05:24
@@ -459,6 +463,14 @@ func (s *PowerVSClusterScope) SetStatus(resourceType infrav1beta2.ResourceType,
}
}

// GetNetworkID returns the Networkf id from status of IBMPowerVSCluster object. If it doesn't exist, returns nil.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// GetNetworkID returns the Networkf id from status of IBMPowerVSCluster object. If it doesn't exist, returns nil.
// GetNetworkID returns the Network id from status of IBMPowerVSCluster object. If it doesn't exist, returns nil.

// if the user has provided the already existing dhcp server name then there might exist network name
// with format DHCPSERVER<DHCPServer.Name>_Private , try fetching that
var networkName string
if s.DHCPServer() != nil && s.DHCPServer().Name != nil {
networkName = fmt.Sprintf("DHCPSERVER%s_Private", *s.DHCPServer().Name)
networkName = dhcpNetworkName(*s.DHCPServer().Name)
} else {
networkName = *s.GetServiceName(infrav1beta2.ResourceTypeNetwork)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this required? as we are already checking this condition above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please give me the ref? I think this is the only place where I am checking with cluster name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying, I understand it is required to preserve the order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, addressed your other comments. ptal.

s.V(3).Info("Checking if PowerVS network exists in IBM Cloud with ID", "networkID", *s.IBMPowerVSCluster.Spec.Network.ID)
network, err := s.IBMPowerVSClient.GetNetworkByID(*s.IBMPowerVSCluster.Spec.Network.ID)
// checkNetwork checks if network exists in cloud with given network's ID or name or dhcp server's ID or name mentioned in spec.
// Priority order: Network ID - Network name - DHCP server ID - DHCP server name - Network with cluster name.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Priority order: Network ID - Network name - DHCP server ID - DHCP server name - Network with cluster name.
// Priority order: Network ID > Network name > DHCP server ID > DHCP server name > Network with cluster name.

@dharaneeshvrd dharaneeshvrd force-pushed the validate-existing-dhcp branch 2 times, most recently from 3cc2ab9 to 8d6fe0c Compare November 11, 2024 10:10
@dharaneeshvrd dharaneeshvrd force-pushed the validate-existing-dhcp branch 2 times, most recently from 41a3a74 to 22012f8 Compare November 19, 2024 04:59
@dharaneeshvrd dharaneeshvrd changed the title Validate DHCP server and network status when using existing network Refactor ReconcileNetwork Nov 19, 2024
@dharaneeshvrd
Copy link
Contributor Author

@Karthik-K-N @Amulyam24 I have refactored this PR to handle plain network and DHCP network separately. PTAL.

Copy link
Contributor

@Amulyam24 Amulyam24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall flow looks neat and good, couple of questions

  1. Should we update this var name dhcpServerActive since we check for both network and dhcp?
  2. In a case where only network name or ID is set in spec, we do not check for associated dhcp servers right, is this to accommodate public network type?

return nil, err
}
s.SetStatus(infrav1beta2.ResourceTypeNetwork, infrav1beta2.ResourceReference{ID: dhcpServer.Network.ID, ControllerCreated: ptr.To(false)})
} else if *dhcpServer.Network.ID != *s.GetNetworkID() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can there occur a scenario where dhcpServer.Network is nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, but have added a nil check for dhcpServer.Network anyway, ptal.

@dharaneeshvrd
Copy link
Contributor Author

dharaneeshvrd commented Nov 25, 2024

Should we update this var name dhcpServerActive since we check for both network and dhcp?

Agree, We mainly check network's active status, so kept networkActive

In a case where only network name or ID is set in spec, we do not check for associated dhcp servers right, is this to accommodate public network type?

Yes that has become the goal of this PR, earlier I was assuming that network should belong to a DHCP server. Now it can be a separate entity, if user provides any field on spec.IBMPowerVSCluster.DHCPServer then only I am considering its a DHCP server's network. Otherwise default flow remains same where no existing details provided, it would create a DHCP server to setup the network.

Copy link
Contributor

@Amulyam24 Amulyam24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM.

one minor suggestion, in UT, can we also verify the ControllerCreated value wherever applicable?

/cc @Karthik-K-N

Copy link
Contributor

@Karthik-K-N Karthik-K-N left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReconcileNetwork is very compilcated logic, Thank you for taking this up.

Few comments otherwise LGTM. Can we also mention in high level as comments over ReconcileNetwork() function what we are doing.

return false, err
}
if dhcpServerID != nil {
s.V(3).Info("Found DHCP server in IBM Cloud", "DHCP Server ID", dhcpServerID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep uniform and improve our logging, lets use , dhcpServerID instead of "DHCP Server ID"

if err != nil {
return nil, err
}
return network.NetworkID, nil
if s.GetNetworkID() == nil {
if _, err := s.IBMPowerVSClient.GetNetworkByID(*dhcpServer.Network.ID); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to say that dhcpServer will be active by this time? If not the network may be nil.
I am just concerned about *dhcpServer.Network.ID

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we are validating existing DHCP server, new DHCP server creation won't reach this block, but anyway I have handled this via nil check, PTAL.

} else {
networkName = *s.GetServiceName(infrav1beta2.ResourceTypeNetwork)
networkName = dhcpNetworkName(s.InfraCluster())
}

s.V(3).Info("Checking if PowerVS network exists in IBM Cloud with name", "name", networkName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log line should be updated to state that, we are trying to look for network from DHCPServer's network list

}

if s.Network().Name != nil {
s.V(3).Info("Checking if PowerVS network exists in IBM Cloud with name")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets log network name also.

- If only IBMPowerVSCluster.Spec.Network is set, network would be validated and if exists already will get used as cluster’s network or a new network will be created via DHCP service.
- If only IBMPowerVSCluster.Spec.DHCPServer is set, DHCP server would be validated and if exists already, will use DHCP server’s network as cluster network. If not a new DHCP service will be created and it’s network will be used.
- If both IBMPowerVSCluster.Spec.Network & IBMPowerVSCluster.Spec.DHCPServer is set, network and DHCP server would be validated and if both exists already then network is belongs to given DHCP server or not would be validated.
@dharaneeshvrd
Copy link
Contributor Author

@Karthik-K-N @Amulyam24 I have tried to address your comments, PTAL.

@k8s-ci-robot
Copy link
Contributor

@dharaneeshvrd: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cluster-api-provider-ibmcloud-test 205a6e2 link true /test pull-cluster-api-provider-ibmcloud-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Copy link
Contributor

@Karthik-K-N Karthik-K-N left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM lets fix the failing test. Thank you.

@@ -847,18 +859,32 @@ func (s *PowerVSClusterScope) createServiceInstance() (*resourcecontrollerv2.Res
return serviceInstance, nil
}

// ReconcileNetwork reconciles network.
// ReconcileNetwork reconciles network
// If only IBMPowerVSCluster.Spec.Network is set, network would be validated and if exists already will get used as cluster’s network or a new network will be created via DHCP service.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also mention what happens when both network and dhcpserver is not set, Currently that configuration is used the most.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/provider/ibmcloud Issues or PRs related to ibmcloud provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Validate DHCP server status before reusing existing DHCP network
5 participants