Installations going in Unknown Phase #287

Open
hemantkathuria opened this issue Oct 29, 2023 · 2 comments

Labels
question Halp plz

Comments

hemantkathuria commented Oct 29, 2023

This is a continuation of issue #285.

I tried adding installationServiceAccount: porter-agent to my customagent agent config.

After that, all my installations go into the Unknown phase and no pods are created. I rolled the change back, but pods are still not being created, and I no longer even see the unauthorized error.

What could the issue be, and where can I find logs that would help me troubleshoot this?

Name:         azuredep-10
Namespace:    unifieddeployment
Labels:       <none>
Annotations:  <none>
API Version:  getporter.org/v1
Kind:         Installation
Metadata:
  Creation Timestamp:  2023-10-29T11:29:41Z
  Finalizers:
    getporter.org/finalizer
  Generation:        1
  Resource Version:  571851
  UID:               af593a0f-ae60-47dc-86b1-7ead0f329975
Spec:
  Agent Config:
    Name:  customagent
  Bundle:
    Repository:  crporterpoc.azurecr.io/porter-hello
    Version:     v0.1.0
  Credential Sets:
    azurecredsetnew5
  Name:       azuredep-10
  Namespace:  unifieddeployment
  Parameters:
    Location:                EastUs2
    storage_account_name:    porterdemo10
    storage_container_name:  container001
    storage_rg:              porterdemo10
  Schema Version:            1.0.2
Status:
  Action:
    Name:               azuredep-10-bq26w
  Observed Generation:  1
  Phase:                Unknown
Events:                 <none>

I just noticed that the Porter controller has gone into the CrashLoopBackOff state. How do I bring it back to a running state? I can't find any meaningful logs that would help with that.

`kubectl logs  porter-operator-controller-manager-744d4cc48f-92466 -n porter-operator-system
Defaulted container "kube-rbac-proxy" out of: kube-rbac-proxy, manager
I1029 10:35:58.833293       1 main.go:186] Valid token audiences:
I1029 10:35:58.833604       1 main.go:232] Generating self signed cert as no cert is provided
I1029 10:35:59.422029       1 main.go:281] Starting TCP socket on 0.0.0.0:8443
I1029 10:35:59.422728       1 main.go:288] Listening securely on 0.0.0.0:8443

Some logs (every entry carried the same export prefix, 10/29/2023, 10:08:25 PM porter-operator-controller-manager-744d4cc48f-92466 630b027287b75c79dfc5a9a2b7594810637e680ef6d8f920536b4025744f904d, shown once here and omitted below):

2023-10-29T16:38:25Z ERROR Reconciler error {"controller": "agentaction", "controllerGroup": "getporter.org", "controllerKind": "AgentAction", "AgentAction": {"name":"hello-llama5-wwds4","namespace":"unifieddeployment"}, "namespace": "unifieddeployment", "name": "hello-llama5-wwds4", "reconcileID": "aeabfc95-5d36-4b31-8018-fb8d0f7b8631", "error": "resolved agent configuration is not ready to be used. Waiting for the next retry", "errorVerbose": "resolved agent configuration is not ready to be used. Waiting for the next retry\nget.porter.sh/operator/controllers.(*AgentActionReconciler).resolveAgentConfig\n\t/workspace/controllers/agentaction_controller.go:535\nget.porter.sh/operator/controllers.(*AgentActionReconciler).runPorter\n\t/workspace/controllers/agentaction_controller.go:185\nget.porter.sh/operator/controllers.(*AgentActionReconciler).Reconcile\n\t/workspace/controllers/agentaction_controller.go:95\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235
2023-10-29T16:38:25Z INFO controllers.Installation Copied status from agent action {"installation": "azuredep-66", "namespace": "unifieddeployment", "resourceVersion": "562522", "generation": 1, "observedGeneration": 1, "agentaction": "azuredep-66-ngwtn", "action": "azuredep-66-ngwtn", "phase": "Failed", "conditions": ["Scheduled", "Started", "Failed"]}
2023-10-29T16:38:25Z DEBUG controllers.Installation Reconciliation complete: A porter agent has already been dispatched. {"installation": "azuredep-66", "namespace": "unifieddeployment", "resourceVersion": "562522", "generation": 1, "observedGeneration": 1, "agentaction": "azuredep-66-ngwtn"}
2023-10-29T16:38:25Z INFO controllers.Installation Reconciling installation {"installation": "azuredep-36", "namespace": "unifieddeployment", "resourceVersion": "623510", "generation": 1, "observedGeneration": 1}
2023-10-29T16:38:25Z DEBUG controllers.Installation Found existing agent action {"installation": "azuredep-36", "namespace": "unifieddeployment", "resourceVersion": "623510", "generation": 1, "observedGeneration": 1, "agentaction": "azuredep-36-zv9lf", "namespace": "unifieddeployment"}
2023-10-29T16:38:25Z INFO controllers.Installation Syncing AgentAction status with Installation {"installation": "azuredep-36", "namespace": "unifieddeployment", "resourceVersion": "623510", "generation": 1, "observedGeneration": 1, "agentaction": "azuredep-36-zv9lf"}
2023-10-29T16:38:25Z INFO controllers.Installation Copied status from agent action {"installation": "azuredep-36", "namespace": "unifieddeployment", "resourceVersion": "623510", "generation": 1, "observedGeneration": 1, "agentaction": "azuredep-36-zv9lf", "action": "azuredep-36-zv9lf", "phase": "Unknown", "conditions": []}
2023-10-29T16:38:25Z INFO controllers.Installation Patching installation status {"installation": "azuredep-36", "namespace": "unifieddeployment", "resourceVersion": "623510", "generation": 1, "observedGeneration": 1, "agentaction": "azuredep-36-zv9lf"}
2023-10-29T16:38:25Z DEBUG controllers.AgentConfig Applied patch {"agent config": "customagent", "namespace": "unifieddeployment", "resourceVersion": "573420", "generation": 3, "observedGeneration": 3, "status": false, "data": "{}"}
2023-10-29T16:38:25Z INFO controllers.AgentConfig Creating porter agent action {"agent config": "customagent", "namespace": "unifieddeployment", "resourceVersion": "573420", "generation": 3, "observedGeneration": 3, "status": false}
2023-10-29T16:38:25Z INFO Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference {"controller": "agentconfig", "controllerGroup": "getporter.org", "controllerKind": "AgentConfig", "AgentConfig": {"name":"customagent","namespace":"unifieddeployment"}, "namespace": "unifieddeployment", "name": "customagent", "reconcileID": "c9f68e27-7be1-4d4a-b0f3-e2280646fa0d"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x146bdf5]
`
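The kubectl logs output above defaulted to the kube-rbac-proxy container, so it does not include the panic from the manager container. A minimal sketch of how the manager logs and the AgentConfig status could be pulled with kubectl is below. The pod, namespace, and agent config names are copied from this issue; the function names are hypothetical helpers, and addressing the CRD as "agentconfig" is an assumption based on the AgentConfig kind shown in the logs.

```shell
# Names copied from this issue; adjust them for your cluster.
OPERATOR_NS="porter-operator-system"
OPERATOR_POD="porter-operator-controller-manager-744d4cc48f-92466"
WORKLOAD_NS="unifieddeployment"

# Logs from the manager container instead of the default kube-rbac-proxy.
# Extra arguments are passed through to kubectl, e.g. --previous or --tail=200.
manager_logs() {
  kubectl logs "$OPERATOR_POD" -n "$OPERATOR_NS" -c manager "$@"
}

# Full AgentConfig object, including its status section.
# Assumption: the CRD can be addressed by the lowercase kind "agentconfig".
agentconfig_yaml() {
  kubectl get agentconfig "$1" -n "$WORKLOAD_NS" -o yaml
}
```

For a pod in CrashLoopBackOff, `manager_logs --previous` asks for the logs of the container instance that last exited, which is where the stack trace of the crash would be expected. `agentconfig_yaml customagent` prints the agent config that the reconciler reports as not ready.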

@hemantkathuria hemantkathuria added the question Halp plz label Oct 29, 2023

hemantkathuria commented Oct 31, 2023

Please advise on this; it would be a great help. @troy0820 @sgettys @schristoff

schristoff (Member) commented

Hi @hemantkathuria - apologies that it's taken me a bit to look at this. I want you to know I'm going to be reviewing this and will get back to you soon with either follow-up questions or a path forward. Thank you for your patience and your amazing in-depth issues :)
