Versions and Operating System
Kubernetes version:
Output of kubectl version:
Client Version: v1.28.10+rke2r1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.10+rke2r1
Tekton Pipeline version:
Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'
v0.48.0
Operating System:
5.14.0-284.30.1.el9_2.x86_64
Expected Behavior
Regardless of the PipelineRun's state, logs are streamed to completion and then the process exits.
Actual Behavior
Very rarely, the process streaming logs hangs forever.
Steps to Reproduce the Problem
I'm unable to provide consistent steps to reproduce the problem, as it appears to be a very tight race. It should, however, be reproducible simply by using the Tekton CLI to tail the logs of a PipelineRun.
Additional Info
The process in which I typically encounter this issue embeds tektoncd/cli in another Go program. Currently, I've been able to reproduce the issue using tektoncd/cli version 0.39.1.
I took a goroutine dump of a process in this state and discovered a deadlock in pkg/pods/pod.go. The first goroutine, which I'll call goroutine "A", is stuck attempting to send an event on a channel. The second goroutine, which I'll call goroutine "B", is stuck attempting to acquire a lock held by goroutine A.
In pod.go's Wait function, a channel eventC is created along with a mutex mu. Both are passed to the watcher, which receives interesting events from Kubernetes. When an event arrives, goroutine A acquires mu and then sends the event on eventC. Goroutine B waits for events on eventC; once it receives an event confirming the pod status, it stops receiving from eventC, acquires mu, and closes eventC.
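To make that shape concrete, here is a heavily simplified, runnable sketch of the pattern as I understand it. Everything beyond the names eventC, mu, and checkPodStatus is my own stand-in (including the closed flag, which I assume is why the mutex exists), not the actual code in pkg/pods/pod.go:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	eventC := make(chan string) // stand-in for the unbuffered eventC created in Wait
	var (
		mu     sync.Mutex
		closed bool // assumption: the mutex presumably keeps A from sending on a closed channel
	)

	// Goroutine A: stands in for the watcher callback. For each interesting
	// Kubernetes event it acquires mu and then sends on eventC.
	go func() {
		for _, e := range []string{"Running", "Succeeded"} {
			mu.Lock()
			if !closed {
				eventC <- e // blocks, while holding mu, until goroutine B receives
			}
			mu.Unlock()
		}
	}()

	// Goroutine B: stands in for the waiter. It receives events until one
	// confirms the final pod status (checkPodStatus in the real code), then
	// acquires mu and closes eventC.
	for e := range eventC {
		fmt.Println("event:", e)
		if e == "Succeeded" {
			break
		}
	}
	mu.Lock()
	closed = true
	close(eventC)
	mu.Unlock()
}
```

In this interleaving, where only the last event ends the receive loop, everything completes normally.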
If the watcher receives two events in quick succession and the first one produces a non-nil first return value from checkPodStatus, a deadlock can occur: goroutine B stops listening on eventC after the first event, while goroutine A is blocked sending the second event on eventC and still holds mu. Goroutine B then blocks trying to acquire mu so it can close the channel, and neither goroutine can make progress.
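One standard Go technique for this kind of stopped-receiver race, offered purely as an illustration and not as the fix the maintainers should necessarily take, is to give the sender an escape hatch: pair eventC with a done channel that goroutine B closes before it stops receiving, and have goroutine A send inside a select. A minimal sketch follows; the done channel is hypothetical and does not exist in pkg/pods/pod.go:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	eventC := make(chan string)
	done := make(chan struct{}) // hypothetical: closed by goroutine B once it stops receiving
	var wg sync.WaitGroup

	// Goroutine A: the watcher stand-in delivers two events in quick
	// succession. The select means a send can no longer block forever once
	// goroutine B has stopped listening.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for _, e := range []string{"Succeeded", "Succeeded"} {
			select {
			case eventC <- e:
			case <-done:
				return // receiver is gone; drop the event instead of blocking
			}
		}
	}()

	// Goroutine B: the waiter stand-in stops after the very first event,
	// which is exactly the case that deadlocks in the pattern described above.
	fmt.Println("event:", <-eventC)
	close(done) // tell goroutine A that nothing will receive from eventC again
	wg.Wait()   // goroutine A exits via the done case instead of hanging
}
```

Other resolutions are possible, for example having goroutine B keep draining eventC until the watcher is stopped; the sketch is only meant to show that the hazard is the blocking send performed while holding mu.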