-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PWX-32881: [CSI] Add CSI socket auto-recover when deleted #2322
PWX-32881: [CSI] Add CSI socket auto-recover when deleted #2322
Conversation
csi/csi.go
Outdated
} | ||
|
||
logrus.Infof("Detected CSI socket deleted at path %s. Stopping CSI server", socketPath) | ||
s.Stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though the socket path was deleted, the process is still running. We need to stop the server and re-create it.
s.Stop() handles killing the net.Listener goroutine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the change is OK -- but I'm confused as why do we need it (did customer request it?)
Hold off on reviews, re-working this PR. |
cd65528
to
40512cf
Compare
827c136
to
598890d
Compare
|
||
// Start server | ||
logrus.Infof("Restarting CSI gRPC server at %s", socketPath) | ||
if err := s.Start(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of failing fast, can we continuously retry?
efc4836
to
36d4a96
Compare
Signed-off-by: Grant Griffiths <[email protected]>
36d4a96
to
636f7ce
Compare
Tested again after the PR feedback, still restarts the CSI server:
|
Signed-off-by: Grant Griffiths <[email protected]>
636f7ce
to
61374da
Compare
This pull request cannot be automatically cherry-picked to the target release branch. |
…rage#2322) Signed-off-by: Grant Griffiths <[email protected]>
…rage#2322) Signed-off-by: Grant Griffiths <[email protected]>
) Signed-off-by: Grant Griffiths <[email protected]>
What this PR does / why we need it:
Add CSI socket auto-recover when deleted.
Which issue(s) this PR fixes (optional)
Closes #
Special notes for your reviewer:
fsnotify
, but it added a lot of complexity and seemed overkill. It was also unreliable compared to checking periodically./var/lib/kubelet/plugins/pxd.portworx.com/csi.sock
. There is another UDS created by the kubelet registration system for registering the CSI node. That is at/var/lib/kubelet/plugins_registry/pxd.portworx.com-reg.sock
. A PX Pod restart is required to get that to re-register. Looking into how that can be automated too. This PR only fixes the issue of having to restart portworx to get the CSI Driver socket re-created.Test Results:
Delete socket on server:
CSI Server restarted:
socket re-created