Compare PodCIDRs in NodeRouteController #7034

Open · wants to merge 1 commit into base: main
26 changes: 19 additions & 7 deletions pkg/agent/controller/noderoute/node_route_controller.go
@@ -19,6 +19,7 @@ import (
"fmt"
"net"
"net/netip"
"slices"
"sync"
"time"

@@ -548,21 +549,32 @@ func (c *Controller) addNodeRoute(nodeName string, node *corev1.Node) error {
		return err
	}
	peerWireGuardPublicKey := node.Annotations[types.NodeWireGuardPublicAnnotationKey]
+	podCIDRStrs := getPodCIDRsOnNode(node)
+	if len(podCIDRStrs) == 0 {
+		// If no valid PodCIDR is configured in Node.Spec, return immediately.
+		return nil
+	}
Comment on lines +553 to +556

Member:
Should the previous routes be deleted if there are any? Otherwise the following situation may cause an issue:

  1. Node A (uid 1) had a PodCIDR 10.0.0.0/24;
  2. Node A (uid 1) was deleted, PodCIDR 10.0.0.0/24 was released;
  3. Node B was created, PodCIDR 10.0.0.0/24 was allocated to Node B;
  4. Node A (uid 2) was created, no PodCIDR was allocated to it.

Node C observed two events: Node B was updated with PodCIDR 10.0.0.0/24, and Node A was updated with an empty PodCIDR.

When reconciling Node B, the controller won't create a new route to 10.0.0.0/24 because of the following nodesHaveSamePodCIDR check.
When reconciling Node A, the controller won't delete the old route to 10.0.0.0/24 because it returns immediately.

Contributor (author):
Yes, you're right, that's an issue.
I think I was assuming that a Node wouldn't stay without a PodCIDR, and that the situation would resolve itself, but I guess it can persist if the cluster has run out of Node CIDRs.
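A minimal sketch of one possible fix for that scenario, assuming the early return can reuse the controller's existing cleanup path (the deleteNodeRoute helper named below is hypothetical and stands in for whatever uninstall logic the controller already runs on Node deletion):

```go
podCIDRStrs := getPodCIDRsOnNode(node)
if len(podCIDRStrs) == 0 {
	// No valid PodCIDR in Node.Spec. If routes/flows were previously installed under this
	// Node name (e.g. the Node was recreated and has not been allocated a PodCIDR yet),
	// remove them instead of returning right away, so that a stale route to a CIDR now
	// owned by another Node does not linger.
	if _, installed, _ := c.installedNodes.GetByKey(nodeName); installed {
		// Hypothetical helper reusing the controller's route/flow removal logic.
		return c.deleteNodeRoute(nodeName)
	}
	return nil
}
```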


	nrInfo, installed, _ := c.installedNodes.GetByKey(nodeName)
-	// Route is already added for this Node and Node MAC, transport IP
-	// and WireGuard public key are not changed.
+	samePodCIDRs := func(podCIDRs []*net.IPNet) bool {
+		// There is no reason to worry about the ordering of the CIDRs here. In a dual-stack
+		// cluster, the CIDRs should always be ordered consistently by IP family. Even if
+		// this was not true, samePodCIDRs would just return false and we would end up
+		// re-installing the same routes (and the podCIDRs field in nodeRouteInfo would be
+		// updated).
+		return slices.EqualFunc(podCIDRStrs, podCIDRs, func(cidrStr string, cidr *net.IPNet) bool {
+			return cidr.String() == cidrStr
+		})
+	}
+	// Route is already added for this Node and Node MAC, transport IP, podCIDRs and WireGuard
+	// public key are not changed.
Comment on lines +569 to +570

Member:
Previously there was no need to delete stale routes first because the PodCIDR is used as the key for routes/flows. Since it can now change, should we delete stale routes first when it changes?

Contributor (author):
I will fix that. Old flows should already be deleted because our OF client uses the node Name as the key (and I think we handle this case already). But there is definitely an issue with routes as they do use the PodCIDR as the key.
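As a rough illustration of that route cleanup (a sketch only; it assumes the route client exposes a DeleteRoutes call keyed by the Pod CIDR, mirroring AddRoutes, which may not match the real interface), stale routes for CIDRs the Node no longer owns could be removed before installing the new ones:

```go
// Drop routes installed for CIDRs that are no longer assigned to this Node,
// before AddRoutes is called for the current podCIDRStrs.
if installed {
	for _, oldCIDR := range nrInfo.(*nodeRouteInfo).podCIDRs {
		if slices.Contains(podCIDRStrs, oldCIDR.String()) {
			continue
		}
		// DeleteRoutes is assumed here to take the stale CIDR as its key.
		if err := c.routeClient.DeleteRoutes(oldCIDR); err != nil {
			return err
		}
	}
}
```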

	if installed && nrInfo.(*nodeRouteInfo).nodeMAC.String() == peerNodeMAC.String() &&
		peerNodeIPs.Equal(*nrInfo.(*nodeRouteInfo).nodeIPs) &&
+		samePodCIDRs(nrInfo.(*nodeRouteInfo).podCIDRs) &&
		nrInfo.(*nodeRouteInfo).wireGuardPublicKey == peerWireGuardPublicKey {
		return nil
	}

-	podCIDRStrs := getPodCIDRsOnNode(node)
-	if len(podCIDRStrs) == 0 {
-		// If no valid PodCIDR is configured in Node.Spec, return immediately.
-		return nil
-	}
	klog.InfoS("Adding routes and flows to Node", "Node", nodeName, "podCIDRs", podCIDRStrs,
		"addresses", node.Status.Addresses)

50 changes: 49 additions & 1 deletion pkg/agent/controller/noderoute/node_route_controller_test.go
@@ -49,11 +49,13 @@ var (
	// podCIDRs of "local" Node
	_, podCIDR, _ = net.ParseCIDR("1.1.0.0/24")
	_, podCIDRv6, _ = net.ParseCIDR("2001:4860:0000::/48")
-	// podCIDRs of "remote" Node
+	// podCIDRs of "remote" Nodes
	_, podCIDR1, _ = net.ParseCIDR("1.1.1.0/24")
	_, podCIDR1v6, _ = net.ParseCIDR("2001:4860:0001::/48")
+	_, podCIDR2, _ = net.ParseCIDR("1.1.2.0/24")
	podCIDR1Gateway = util.GetGatewayIPForPodCIDR(podCIDR1)
	podCIDR1v6Gateway = util.GetGatewayIPForPodCIDR(podCIDR1v6)
+	podCIDR2Gateway = util.GetGatewayIPForPodCIDR(podCIDR2)
	nodeIP1 = net.ParseIP("10.10.10.10")
	dsIPs1 = utilip.DualStackIPs{IPv4: nodeIP1}
	nodeIP2 = net.ParseIP("10.10.10.11")
@@ -198,6 +200,52 @@ func TestControllerWithDuplicatePodCIDR(t *testing.T) {
}
}

+func TestNodeRecreateNewPodCIDR(t *testing.T) {
+	c := newController(t, &config.NetworkConfig{})
+	defer c.queue.ShutDown()
+
+	stopCh := make(chan struct{})
+	defer close(stopCh)
+	c.informerFactory.Start(stopCh)
+	c.informerFactory.WaitForCacheSync(stopCh)
+
+	// Remove IPv6 Pod CIDR to simplify test.
+	oldNode := node1.DeepCopy()
+	oldNode.UID = "old-uid"
+	oldNode.Spec.PodCIDR = podCIDR1.String()
+	oldNode.Spec.PodCIDRs = []string{podCIDR1.String()}
+
+	newNode := node1.DeepCopy()
+	newNode.UID = "new-uid"
+	newNode.Spec.PodCIDR = podCIDR2.String()
+	newNode.Spec.PodCIDRs = []string{podCIDR2.String()}
+
+	c.clientset.CoreV1().Nodes().Create(context.TODO(), oldNode, metav1.CreateOptions{})
+	c.ofClient.EXPECT().InstallNodeFlows("node1", gomock.Any(), &dsIPs1, uint32(0), nil).Times(1)
+	c.routeClient.EXPECT().AddRoutes(podCIDR1, "node1", nodeIP1, podCIDR1Gateway).Times(1)
+	c.processNextWorkItem()
+
+	// Node is deleted, then shortly after is "recreated" (same name) with a different PodCIDR.
+	c.clientset.CoreV1().Nodes().Delete(context.TODO(), oldNode.Name, metav1.DeleteOptions{})
+	// The sleep is for illustration purposes. The duration does not really matter. What matters
+	// is that by the time we call processNextWorkItem below, the informer store has been
+	// updated with the new Node.
+	time.Sleep(10 * time.Millisecond)
+	c.clientset.CoreV1().Nodes().Create(context.TODO(), newNode, metav1.CreateOptions{})
+
+	// At the moment, the test will fail because these expectations are not met.
+	c.ofClient.EXPECT().InstallNodeFlows("node1", gomock.Any(), &dsIPs1, uint32(0), nil).Times(1)
+	c.routeClient.EXPECT().AddRoutes(podCIDR2, "node1", nodeIP1, podCIDR2Gateway).Times(1)
+
+	require.Eventually(t, func() bool {
+		node, err := c.nodeLister.Get("node1")
+		return err == nil && node.UID == "new-uid"
+	}, 1*time.Second, 10*time.Millisecond)
+
+	// The workqueue may still be empty by the time this is called, but processNextWorkItem will block.
+	c.processNextWorkItem()
+}

func TestLookupIPInPodSubnets(t *testing.T) {
	c := newController(t, &config.NetworkConfig{})
	defer c.queue.ShutDown()