pso-csi-node pods restart frequently #168
Hi, @rpatil72. Can you turn on the debug flag in values.yaml, run the script here to collect the logs, and send them to [email protected]? Thanks.
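A minimal sketch of what re-deploying with debug logging might look like; the `app.debug` key is an assumption here (confirm the exact flag name in the chart's bundled values.yaml), and the release name and namespace are taken from the `helm list` output later in this thread:

```
# Hypothetical key name -- check the chart's values.yaml for the real debug flag.
# Re-deploys the release with debug logging turned on before re-collecting logs.
helm upgrade pure-pso <chart> -n pure-pso --set app.debug=true
```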
Thanks @dsupure for the quick response.
Thank you, @rpatil72. We got the logs. Our support team is debugging them. We will contact you shortly once we find the issue. Thanks for your patience.
Hi @zatricky:
Hello @rpatil72 @zatricky, after investigation we probably need more info from you:
From the latest log I got from the email
Hi, @pure-jliao. Our org has sent in a support request. Note that I see errors similar to those @rpatil72 submitted. In our case the original cause was a hypervisor cluster failure affecting many unrelated hosts, including an entire Kubernetes cluster making use of the storage. We had already recovered all core services. I was hoping a resolution had been found for @rpatil72 so that we could at least know where to look next before submitting that support request (which has of course already been sent by now).
Thanks @zatricky. FYI, we did find something wrong related to this error log line on our side.
@rpatil72: can you help answer the questions from @pure-jliao? Thanks.
@pure-jliao, please find the replies below:
@rpatil72 thanks for replying. One thing to note: right now we haven't tested PSO 6 on k8s 1.21, and it's not officially supported yet. We're not 100% sure this is caused by a newer, not-yet-validated version of k8s, so if possible, could you try a fresh install on k8s 1.20? Meanwhile, in your first post only 2 node pods are restarting and all other PSO pods are healthy; if you can reach that state again, could you grab the logs again? It might be a little hard to get logs if the pods keep crashing (see the sketch below). Also, could you try running PSO without the explorer to rule out other effects? If it still cannot be resolved, we could schedule a call. cc: @dsupure @dingyin
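On the point about logs being hard to grab while pods keep crashing: `kubectl logs --previous` reads from the last terminated container instance, which survives a restart. A sketch, assuming the `pure-pso` namespace from this thread; the pod name is a placeholder:

```
# Show restart counts for the pods in the release namespace.
kubectl get pods -n pure-pso

# Logs from the previously terminated instance of the CSI node container;
# pso-csi-node-xxxxx is a placeholder for an actual pod name.
kubectl logs -n pure-pso pso-csi-node-xxxxx -c pso-csi-container --previous
```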
Facing the same issue.

```
bash-3.2$ helm list -n pure-pso
NAME      NAMESPACE  REVISION  UPDATED                                STATUS    CHART           APP VERSION
pure-pso  pure-pso   1         2021-06-30 01:00:21.279272 +0200 CEST  deployed  pure-pso-6.1.1  6.1.1
bash-3.2$ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-12T14:11:29Z", GoVersion:"go1.16.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:25:06Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"linux/amd64"}
```

Any updates?
@smrutimandal sorry, we haven't narrowed down the problem here; there could be a lot of things causing PSO pod failures. Please collect the logs and contact the support team for further investigation.
Hi, thanks for getting back to me. I am in the process of getting my support contract in order. In the meantime, I found that the following feature-gate defaults have changed between 1.20 and 1.21.
Kube API Server v1.20:
Kube API Server v1.21:
Can these changes have any impact on the liveness probes? Any help is greatly appreciated. Thanks
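One way to check which feature gates are explicitly set on the API server (defaults that merely changed between releases will not appear) is to inspect its command line; a sketch assuming a kubeadm-style static pod manifest:

```
# On a control-plane node; kubeadm layout assumed. Only explicitly
# overridden gates show up here -- changed defaults are invisible.
grep -- --feature-gates /etc/kubernetes/manifests/kube-apiserver.yaml

# Alternatively, read the flags off the running process.
ps aux | grep kube-apiserver
```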
@smrutimandal: As @pure-jliao mentioned, there are many possibilities and it's hard to troubleshoot your problem via GitHub. Please contact Pure support and we will create a support case for you. We will follow up there. Thank you.
Hi @dsupure, I understand that. If compatibility is really an issue, though, the https://github.com/purestorage/pso-csi#environments-supported section, which assures support for v1.17+, is a bit confusing. I will get in touch with support. Thanks
1.21 is supposed to be supported now; do you have any update? I get the same liveness probe errors since the K8s upgrade (1.19 is EoL), though the setup seems to work OK: no restarts so far (to be monitored) and volumes get attached. The probe errors come from these containers:

node-driver-registrar
pso-csi-container
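To pin down which probe is failing and with what error, the pod events and per-container state are the first place to look; a sketch using standard kubectl commands, with a placeholder pod name:

```
# Probe-failure events and per-container last state / exit codes.
kubectl describe pod -n pure-pso pso-csi-node-xxxxx

# Logs of the registrar sidecar named above.
kubectl logs -n pure-pso pso-csi-node-xxxxx -c node-driver-registrar
```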