This repository has been archived by the owner on May 12, 2021. It is now read-only.

Creating container DO NOT need runtime anymore in kata-shim-v2? #1170

Closed
BetaXOi opened this issue Jan 24, 2019 · 7 comments

Comments

@BetaXOi

BetaXOi commented Jan 24, 2019

Since Kata added support for the containerd shimv2 API (#572), creating a container via kata-shim-v2 no longer needs kata-runtime. I don't think this is a good idea, because it breaks the original containerd architecture. See https://github.com/crosbymichael/dockercon-2016/blob/master/Creating%20Containerd.pdf

I ran some tests with kata-shim-v2, and it appeared to cause zombie processes in the container because kata-shim-v2 does not listen for SIGCHLD and reap child processes. However, the zombies may be there because kata-agent does not set CHILD SUBREAPER, so this is not associated with kata-shim-v2. My bad.
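For readers unfamiliar with the two mechanisms mentioned above, here is a minimal Linux-only sketch of what "set CHILD SUBREAPER" and "reap child processes" mean in practice. It is written in Python for illustration and is not the code of kata-agent or kata-shim-v2; the helper names `become_subreaper` and `reap_children` are made up for this example, and the `prctl` call goes through `ctypes` on the assumption that the C library exposes it.

```python
import ctypes
import os

PR_SET_CHILD_SUBREAPER = 36  # constant from <linux/prctl.h>


def become_subreaper() -> None:
    """Ask the kernel to reparent orphaned descendants to this process."""
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.prctl(
        ctypes.c_int(PR_SET_CHILD_SUBREAPER),
        ctypes.c_ulong(1),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
    )
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def reap_children() -> dict:
    """Collect every child that has already exited, without blocking.

    Returns a mapping of pid -> raw wait status. A process that never
    does this leaves its exited children behind as zombies (STAT "Z").
    """
    reaped = {}
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped[pid] = status
    return reaped
```

A long-running process would typically call `reap_children()` from its SIGCHLD handling path, for example after `signal.signal(signal.SIGCHLD, ...)` sets a flag in its main loop.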

@lifupan
Member

lifupan commented Jan 24, 2019

Hi @BetaXOi, where is the zombie process: on the host or in the QEMU guest? If it is on the host, can you paste some details about the zombie processes?

@BetaXOi
Author

BetaXOi commented Jan 24, 2019

The zombie process is in the container. My test details are below.

  • create the v1 and v2 containers
# ningbo @ localhost in ~ [20:21:05] C:1                                                                                                              
$ sudo ctr run -t -d docker.io/library/alpine:latest v1 sleep 1d                                                                                      
                                                                                                                                                      
# ningbo @ localhost in ~ [20:21:12]                                                                                                                  
$ sudo ctr run -t -d --runtime io.containerd.kata.v2 docker.io/library/alpine:latest v2 sleep 1d                                                      
                                                                                                                                                      
# ningbo @ localhost in ~ [20:21:34]                                                                                                                  
$ sudo ctr t list                                                                                                                                     
TASK    PID      STATUS                                                                                                                               
v1      17809    RUNNING                                                                                                                              
v2      17895    RUNNING
  • create parent.sh and child.sh in v1 and v2 containers
# ningbo @ localhost in ~ [20:25:00]                                                                                                                  
$ sudo ctr t exec -t --exec-id 100 v1 sh                                                                                                              
/ # echo "sh ./child.sh" |tee parent.sh                                                                                                               
sh ./child.sh                                                                                                                                         
/ # echo "while true; do sleep 10; done" |tee child.sh                                                                                                
while true; do sleep 10; done                                                                                                                         
/ # chmod +x *.sh

(create parent.sh and child.sh in v2 also)

  • execute parent.sh in the v1 and v2 containers

v1:

/ # ./parent.sh &
/ # ps axf -o pid,ppid,stat,comm,args                                                                                                                 
PID   PPID  STAT COMMAND          COMMAND                                                                                                             
    1     0 S    sleep            sleep 1d                                                                                                            
   16     0 S    sh               sh                                                                                                                  
   33    16 S    busybox          {busybox} ash ./parent.sh                                                                                           
   34    33 S    sh               sh ./child.sh                                                                                                       
   41    34 S    sleep            sleep 10                                                                                                            
   43    16 R    ps               ps axf -o pid,ppid,stat,comm,args

v2:

/ # ./parent.sh &
/ # ps axf -o pid,ppid,stat,comm,args                                                                                                                 
PID   PPID  STAT COMMAND          COMMAND                                                                                                             
    1     0 S    sleep            sleep 1d                                                                                                            
    6     0 S    sh               sh                                                                                                                  
   16     6 S    busybox          {busybox} ash ./parent.sh                                                                                           
   17    16 S    sh               sh ./child.sh                                                                                                       
   21    17 S    sleep            sleep 10                                                                                                            
   24     6 R    ps               ps axf -o pid,ppid,stat,comm,args
  • kill parent.sh then kill child.sh, and check process state

v1:

/ # kill -9 33                                                                                                                                        
/ # ps axf -o pid,ppid,stat,comm,args                                                                                                                 
PID   PPID  STAT COMMAND          COMMAND                                                                                                             
    1     0 S    sleep            sleep 1d                                                                                                            
   16     0 S    sh               sh                                                                                                                  
   34     0 S    sh               sh ./child.sh                                                                                                       
   72    34 S    sleep            sleep 10                                                                                                            
   73    16 R    ps               ps axf -o pid,ppid,stat,comm,args                                                                                   
[1]+  Killed                     ./parent.sh                                                                                                          
/ # kill -9 34                                                                                                                                        
/ # ps axf -o pid,ppid,stat,comm,args                                                                                                                 
PID   PPID  STAT COMMAND          COMMAND                                                                                                             
    1     0 S    sleep            sleep 1d                                                                                                            
   16     0 S    sh               sh                                                                                                                  
   76    16 R    ps               ps axf -o pid,ppid,stat,comm,args

v2:

/ # kill -9 16                                                                                                                                        
/ # ps axf -o pid,ppid,stat,comm,args                                                                                                                 
PID   PPID  STAT COMMAND          COMMAND                                                                                                             
    1     0 S    sleep            sleep 1d                                                                                                            
    6     0 S    sh               sh                                                                                                                  
   17     1 S    sh               sh ./child.sh                                                                                                       
   52    17 S    sleep            sleep 10                                                                                                            
   53     6 R    ps               ps axf -o pid,ppid,stat,comm,args                                                                                   
[1]+  Killed                     ./parent.sh                                                                                                          
/ # kill -9 17                                                                                                                                        
/ # ps axf -o pid,ppid,stat,comm,args                                                                                                                 
PID   PPID  STAT COMMAND          COMMAND                                                                                                             
    1     0 S    sleep            sleep 1d                                                                                                            
    6     0 S    sh               sh                                                                                                                  
   17     1 Z    sh               [sh]                                                                                                                
   55     1 Z    sleep            [sleep]                                                                                                             
   56     6 R    ps               ps axf -o pid,ppid,stat,comm,args
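When comparing runs like the ones above, it can help to pull the zombie rows out of the `ps` output mechanically. The following is a small sketch, assuming the column order `pid,ppid,stat,comm,args` used in this test; `find_zombies` and `SAMPLE` are names invented for the example, and `SAMPLE` is the final v2 listing copied from above.

```python
from typing import List, Tuple

SAMPLE = """\
/ # kill -9 17
/ # ps axf -o pid,ppid,stat,comm,args
PID   PPID  STAT COMMAND          COMMAND
    1     0 S    sleep            sleep 1d
    6     0 S    sh               sh
   17     1 Z    sh               [sh]
   55     1 Z    sleep            [sleep]
   56     6 R    ps               ps axf -o pid,ppid,stat,comm,args
"""


def find_zombies(ps_output: str) -> List[Tuple[int, int, str]]:
    """Return (pid, ppid, comm) for every row whose STAT starts with 'Z'."""
    zombies = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip the header, shell prompts, and job-control notices:
        # only rows that begin with two numeric columns are process rows.
        if len(fields) < 4 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        pid, ppid, stat, comm = int(fields[0]), int(fields[1]), fields[2], fields[3]
        if stat.startswith("Z"):
            zombies.append((pid, ppid, comm))
    return zombies


print(find_zombies(SAMPLE))  # [(17, 1, 'sh'), (55, 1, 'sleep')]
```

Both zombies have PPID 1, that is, they were reparented to the container's `sleep 1d` process, which never calls `wait`.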

@lifupan
Member

lifupan commented Jan 31, 2019

Hi @BetaXOi

I cannot reproduce your v1 result. Which containerd version and host did you use?
Here is my run:

root@kata-benchmark:# ctr run -t -d docker.io/library/alpine:latest v1 sleep 1d  
root@kata-benchmark:# ctr t exec -t --exec-id 100 v1 sh 
/ # echo "sh ./child.sh" |tee parent.sh 
sh ./child.sh
/ # echo "while true; do sleep 10; done" |tee child.sh  
while true; do sleep 10; done
/ # chmod +x *.sh
/ # ./parent.sh &
/ # ps axf -o pid,ppid,stat,comm,args
PID   PPID  STAT COMMAND          COMMAND
    1     0 S    sleep            sleep 1d
    6     0 S    sh               sh
   17     6 S    busybox          {busybox} ash ./parent.sh
   18    17 S    sh               sh ./child.sh
   60    18 S    sleep            sleep 10
   61     6 R    ps               ps axf -o pid,ppid,stat,comm,args
/ # 
/ # 
/ # kill -9 17 
/ # ps axf -o pid,ppid,stat,comm,args 
PID   PPID  STAT COMMAND          COMMAND
    1     0 S    sleep            sleep 1d
    6     0 S    sh               sh
   18     1 S    sh               sh ./child.sh
   66    18 S    sleep            sleep 10
   67     6 R    ps               ps axf -o pid,ppid,stat,comm,args
[1]+  Killed                     ./parent.sh
/ # 
/ # kill -9 18 
/ # 
/ # ps axf -o pid,ppid,stat,comm,args 
PID   PPID  STAT COMMAND          COMMAND
    1     0 S    sleep            sleep 1d
    6     0 S    sh               sh
   18     1 Z    sh               [sh]
   69     1 S    sleep            sleep 10
   70     6 R    ps               ps axf -o pid,ppid,stat,comm,args
/ # 

@lifupan
Member

lifupan commented Feb 12, 2019

Hi @BetaXOi, Any comments?

@lifupan
Member

lifupan commented Feb 13, 2019

Hi @BetaXOi, I spent some time investigating this issue and found that:
1: kata-agent does set CHILD SUBREAPER in https://github.com/kata-containers/agent/blob/ca9d52094d41af3e43f41b88388e9ce11222c8c6/agent.go#L626
2: The zombie behavior is a feature of recent kernels, not a bug.

In fact, you can also reproduce the zombie process with runC on a host kernel later than 4.11,
so the runC run in which you saw no zombie process must have been on a kernel older than 4.11.
This difference was triggered by this kernel patch: https://lkml.org/lkml/2017/1/30/630

I think you have misunderstood the kernel's CHILD SUBREAPER feature. A child subreaper cannot
reap a child from a different PID namespace. According to the latest kernel source code, https://elixir.bootlin.com/linux/latest/source/kernel/exit.c#L603, the logic for finding a process's reaper is as follows:

  1. give it to another thread in its parent's thread group, if such a member exists.
  2. give it to the first ancestor process which prctl'd itself as a
    child_subreaper for its children in the same pid namespace.
  3. give it to the init process (PID 1) in our pid namespace

That is why the kernel reparents the child process to the container's init process (PID 1) instead of the shim or kata-agent process.
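The three steps can be sketched as a toy model. This is a simplified Python illustration of the lookup order described above, not the kernel's code: the `Proc` class, `find_reaper`, the namespace labels, and the ids are all invented for the example, and the `same_ns_only=False` switch is only this sketch's stand-in for the older behavior, where the ancestor walk was not limited to the same PID namespace.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Proc:
    pid: int                 # unique id in this toy model (not a real PID)
    ppid: int                # id of the parent, 0 if none
    pidns: str               # label of the PID namespace the process lives in
    subreaper: bool = False  # has it set PR_SET_CHILD_SUBREAPER?
    ns_init: bool = False    # is it PID 1 of its namespace?
    tgid: int = 0            # thread-group id; 0 means "same as pid"
    alive: bool = True

    def __post_init__(self) -> None:
        if self.tgid == 0:
            self.tgid = self.pid


def find_reaper(procs: Dict[int, Proc], dying: int,
                same_ns_only: bool = True) -> Optional[int]:
    """Return the id of the process that inherits the children of `dying`."""
    father = procs[dying]
    # Step 1: another live thread in the dying parent's thread group.
    for p in procs.values():
        if p.pid != father.pid and p.alive and p.tgid == father.tgid:
            return p.pid
    # Step 2: nearest ancestor marked as subreaper, optionally stopping
    # as soon as the walk leaves the dying parent's PID namespace.
    cur = procs.get(father.ppid)
    while cur is not None:
        if same_ns_only and cur.pidns != father.pidns:
            break
        if cur.subreaper and cur.alive:
            return cur.pid
        cur = procs.get(cur.ppid)
    # Step 3: the init process of the dying parent's PID namespace.
    for p in procs.values():
        if p.pidns == father.pidns and p.ns_init:
            return p.pid
    return None


PROCS = {p.pid: p for p in [
    Proc(100, 0, "host", subreaper=True),  # shim-like subreaper outside the container
    Proc(1, 100, "ctr", ns_init=True),     # container init: sleep 1d
    Proc(6, 100, "ctr"),                   # exec'd shell that joined the namespace
    Proc(16, 6, "ctr"),                    # parent.sh
    Proc(17, 16, "ctr"),                   # child.sh
]}

print(find_reaper(PROCS, 16))                      # 1
print(find_reaper(PROCS, 16, same_ns_only=False))  # 100
```

With the namespace check, killing `parent.sh` hands `child.sh` to the container's PID 1, which matches the `PPID 1` rows in the transcripts; without it, the subreaper outside the namespace would inherit the child.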

@bergwolf @gnawux @sboeuf WDYT?

@gnawux
Member

gnawux commented Feb 13, 2019

Thanks for the clarification, @lifupan.

Yes, as @lifupan said, it's a feature, not a bug 😂

The zombies result from relying on a kernel defect that has been fixed in newer kernels; they are not related to any Kata feature.

However, as there are users depending on the fixed behavior, I think we should document this case at least.

@gnawux
Member

gnawux commented Feb 13, 2019

@BetaXOi
but i don't think this is a good idea. It breaks the containerd original architecture. see https://github.com/crosbymichael/dockercon-2016/blob/master/Creating%20Containerd.pdf

What you referenced was presented in 2016, whereas shimv2 was introduced in containerd in 2018 (containerd/containerd#2434), and that containerd PR is self-documenting.

In issue #485, which was implemented in #572, we discussed this with the containerd maintainers, including Michael Crosby, and made sure the implementation followed the latest containerd principles and architecture. Future runtimes are encouraged to implement the shimv2 API instead of the non-standard, overly process-oriented runC-compatible command-line interface.

I hope this helps you understand where shimv2 came from, @BetaXOi.

@BetaXOi BetaXOi closed this as completed Mar 6, 2019