Fix: one NIC's IP pool shortage depleted IPs of other NICs in a multi-NIC setup. #4379

Merged: 1 commit, Dec 11, 2024

@ty-dc ty-dc commented Dec 11, 2024

Thanks for contributing!

Notice:

What issue(s) does this PR fix:

Fixes #4295

Special notes for your reviewer:

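Symptom:
When a Pod uses multiple underlay NICs, if the first NIC is allocated an IP successfully but the second NIC fails because its pool has run out of IPs, the Pod restarts. After the restart, the first NIC is allocated a new IP again, and the cycle repeats. As a result, the same Pod UID is recorded repeatedly in the first NIC's IP pool until that pool is exhausted and can no longer allocate addresses to other Pods.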

kubectl get sp default-v4-ippool -ojsonpath={.status.allocatedIPs} | jq 
{
  "172.18.40.10": {
    "pod": "default/ippool-test-app-58b7766c7c-mvwpf",
    "podUid": "4009cb83-bd17-4b08-92f0-2ea0cd3d64fc"
  },
  "172.18.40.11": {
    "pod": "default/ippool-test-app-58b7766c7c-5jp5d",
    "podUid": "4e57d4ee-fd3f-44fa-8c80-1db2c6ae0e98"
  },
  "172.18.40.12": {
    "pod": "default/ippool-test-app-58b7766c7c-gw8hd",
    "podUid": "4102bd95-8f63-4f66-9b64-d7050e91b6c5"
  },
  "172.18.40.13": {
    "pod": "default/ippool-test-app-58b7766c7c-gw8hd",
    "podUid": "4102bd95-8f63-4f66-9b64-d7050e91b6c5"
  },
  "172.18.40.19": {
    "pod": "default/ippool-test-app-58b7766c7c-gw8hd",
    "podUid": "4102bd95-8f63-4f66-9b64-d7050e91b6c5"
  },
  "172.18.40.20": {
    "pod": "default/ippool-test-app-58b7766c7c-gw8hd",
    "podUid": "4102bd95-8f63-4f66-9b64-d7050e91b6c5"
  }
}
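
The duplicated `podUid` values in the output above can also be spotted programmatically. The following is a minimal sketch, not part of this PR: the `allocation` type and `duplicateUIDs` helper are hypothetical names, and it assumes a Pod is expected to hold at most one IP per pool, which may not hold if several NICs of one Pod share a pool.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// allocation mirrors one value of the allocatedIPs JSON shown above.
// The type name is hypothetical; only the JSON keys come from the output.
type allocation struct {
	Pod    string `json:"pod"`
	PodUID string `json:"podUid"`
}

// duplicateUIDs returns, for each Pod UID that holds more than one IP
// in the pool, the sorted list of IPs recorded under that UID.
func duplicateUIDs(raw []byte) (map[string][]string, error) {
	var records map[string]allocation
	if err := json.Unmarshal(raw, &records); err != nil {
		return nil, err
	}
	byUID := make(map[string][]string)
	for ip, rec := range records {
		byUID[rec.PodUID] = append(byUID[rec.PodUID], ip)
	}
	for uid, ips := range byUID {
		if len(ips) < 2 {
			delete(byUID, uid) // a single IP per UID is the healthy case
			continue
		}
		sort.Strings(ips)
	}
	return byUID, nil
}

func main() {
	// Abbreviated copy of the kubectl output above.
	raw := []byte(`{
	  "172.18.40.10": {"pod": "default/ippool-test-app-58b7766c7c-mvwpf", "podUid": "4009cb83-bd17-4b08-92f0-2ea0cd3d64fc"},
	  "172.18.40.12": {"pod": "default/ippool-test-app-58b7766c7c-gw8hd", "podUid": "4102bd95-8f63-4f66-9b64-d7050e91b6c5"},
	  "172.18.40.13": {"pod": "default/ippool-test-app-58b7766c7c-gw8hd", "podUid": "4102bd95-8f63-4f66-9b64-d7050e91b6c5"}
	}`)
	dups, err := duplicateUIDs(raw)
	if err != nil {
		panic(err)
	}
	for uid, ips := range dups {
		fmt.Printf("Pod UID %s holds %d IPs: %v\n", uid, len(ips), ips)
	}
}
```

In the full output above, this check would flag the Pod `ippool-test-app-58b7766c7c-gw8hd`, whose UID appears under four different IPs.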

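Cause:
In a multi-NIC scenario, IPAM is called for each NIC in turn to allocate an address. Suppose a Pod has NICs eth0 and net1. When the Pod starts, an IP address is allocated for eth0 first, then for net1. If the IP pool used by net1 does not have enough addresses, that allocation fails, and IPAM retries in a loop, starting again from eth0. The Pod's eth0 NIC is therefore allocated a new IP from its pool on every retry, until the pool is exhausted.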
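
The retry behaviour described above can be reproduced with a toy model. This is only an illustrative sketch under simplifying assumptions: `pool`, `newPool`, `allocateAllNICs`, and `retryAllocation` are hypothetical names, not Spiderpool code, and the model ignores everything except the order of allocation and the retry.

```go
package main

import "fmt"

// pool is a toy stand-in for an IP pool: free addresses plus IP -> Pod UID records.
type pool struct {
	free      []string
	allocated map[string]string
}

func newPool(ips ...string) *pool {
	return &pool{free: ips, allocated: make(map[string]string)}
}

// allocate hands out the next free IP without checking whether podUID
// already holds one in this pool. That missing check is the bug.
func (p *pool) allocate(podUID string) bool {
	if len(p.free) == 0 {
		return false
	}
	ip := p.free[0]
	p.free = p.free[1:]
	p.allocated[ip] = podUID
	return true
}

// allocateAllNICs walks the NIC pools in order (eth0 first, then net1)
// and stops at the first pool that cannot serve the request.
func allocateAllNICs(podUID string, pools []*pool) bool {
	for _, p := range pools {
		if !p.allocate(podUID) {
			return false
		}
	}
	return true
}

// retryAllocation restarts from the first NIC after every failure.
func retryAllocation(podUID string, pools []*pool, maxAttempts int) bool {
	for i := 0; i < maxAttempts; i++ {
		if allocateAllNICs(podUID, pools) {
			return true
		}
	}
	return false
}

func main() {
	eth0 := newPool("172.18.40.12", "172.18.40.13", "172.18.40.19", "172.18.40.20")
	net1 := newPool() // the pool for net1 has no free addresses
	ok := retryAllocation("4102bd95-8f63-4f66-9b64-d7050e91b6c5", []*pool{eth0, net1}, 10)
	fmt.Println("success:", ok, "eth0 records:", len(eth0.allocated), "eth0 free:", len(eth0.free))
}
```

In this model every failed attempt leaves one more eth0 record behind for the same UID, so the eth0 pool drains even though the Pod never starts successfully.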

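Solution:

Check the IPs already allocated in the pool. If a record's UID matches the UID of the Pod requesting allocation, one of the Pod's NICs has already obtained an IP address from this pool, so no further allocation is needed and the IP address the Pod already holds is returned. This avoids the repeated allocations that exhaust the pool.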
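
The idea of the fix can be sketched as follows. This is a simplified illustration, not the code changed in `pkg/ippoolmanager/ippool_manager.go`: `ipPool`, `record`, `allocateOnce`, and `errPoolExhausted` are hypothetical names, and the sketch assumes each Pod needs at most one address from a given pool.

```go
package main

import (
	"errors"
	"fmt"
)

// record is a simplified allocation record; only PodUID matters here.
type record struct {
	PodUID string
}

// ipPool is a hypothetical in-memory pool, not the project's IPPool type.
type ipPool struct {
	free    []string
	records map[string]record // IP -> allocation record
}

var errPoolExhausted = errors.New("ip pool exhausted")

// allocateOnce returns the IP already recorded for podUID, if any.
// Only when no record matches does it take a new IP from the free list.
func (p *ipPool) allocateOnce(podUID string) (string, error) {
	for ip, rec := range p.records {
		if rec.PodUID == podUID {
			return ip, nil // reuse the existing IP instead of allocating again
		}
	}
	if len(p.free) == 0 {
		return "", errPoolExhausted
	}
	ip := p.free[0]
	p.free = p.free[1:]
	p.records[ip] = record{PodUID: podUID}
	return ip, nil
}

func main() {
	eth0 := &ipPool{
		free:    []string{"172.18.40.12", "172.18.40.13", "172.18.40.19"},
		records: map[string]record{},
	}
	// Simulate ten retries for the same Pod UID.
	for attempt := 1; attempt <= 10; attempt++ {
		if _, err := eth0.allocateOnce("4102bd95-8f63-4f66-9b64-d7050e91b6c5"); err != nil {
			panic(err)
		}
	}
	fmt.Println("records:", len(eth0.records), "free:", len(eth0.free))
}
```

With the check in place, repeated attempts for the same UID leave a single record in the pool, so a failing second NIC no longer drains the first NIC's pool. The linear scan over records is chosen for clarity; a real implementation might index records by UID.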

…of other NICs in a multi-NIC setup.

Signed-off-by: tao.yang <[email protected]>
@ty-dc ty-dc added labels Dec 11, 2024: release/bug, cherrypick-release-v0.8 (Cherry-pick the PR to branch release-v0.8.), cherrypick-release-v0.9, cherrypick-release-v1.0 (Cherry-pick the PR to branch release-v1.0.)

codecov bot commented Dec 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.85%. Comparing base (e3b8408) to head (834562d).
Report is 34 commits behind head on main.

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #4379      +/-   ##
==========================================
+ Coverage   79.56%   79.85%   +0.28%     
==========================================
  Files          54       54              
  Lines        6362     6369       +7     
==========================================
+ Hits         5062     5086      +24     
+ Misses       1103     1082      -21     
- Partials      197      201       +4     
Flag Coverage Δ
unittests 79.85% <100.00%> (+0.28%) ⬆️

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
pkg/ippoolmanager/ippool_manager.go 84.61% <100.00%> (-2.03%) ⬇️

... and 1 file with indirect coverage changes

@weizhoublue weizhoublue changed the title from "Fix: Resolved an issue where one NIC's IP pool shortage depleted IPs of other NICs in a multi-NIC setup." to "Fix: one NIC's IP pool shortage depleted IPs of other NICs in a multi-NIC setup." Dec 11, 2024
@weizhoublue weizhoublue merged commit f9feb6b into spidernet-io:main Dec 11, 2024
58 checks passed
// Check if there is a duplicate Pod UID in IPPool.allocatedRecords.
// If so, we skip this allocation and assume that this Pod has already obtained an IP address in the pool.
if record.PodUID == string(pod.UID) {
	logger.Sugar().Warnf("The Pod %s/%s UID %s already exists in the assigned IP %s", pod.Namespace, pod.Name, string(pod.UID), ip)

I think the info level would be sufficient here.

github-actions bot pushed a commit that referenced this pull request Dec 11, 2024
Fix:  one NIC's IP pool shortage depleted IPs of other NICs in a multi-NIC setup.
Signed-off-by: robot <[email protected]>
Development

Successfully merging this pull request may close these issues.

CNI loading failure causing spiderIPpool ip exhaustion
3 participants