agent crash with compiled module 3.4.10-3.4.15 #110

Open
alexmirtoff opened this issue Dec 12, 2018 · 9 comments

@alexmirtoff

cat /etc/os-release 
NAME="SLES"
VERSION="12-SP3"
VERSION_ID="12.3"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP3"
docker version
Client:
 Version:           18.09.0
 API version:       1.39
 Go version:        go1.10.4
 Git commit:        33a45cd
 Built:             Wed Nov  7 00:25:11 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Enterprise
 Engine:
  Version:          18.09.0
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.4
  Git commit:       33a45cd
  Built:            Wed Nov  7 00:19:46 2018
  OS/Arch:          linux/amd64
  Experimental:     false
*** Error in `/usr/sbin/zabbix-agentd: listener #3 [processing request]': munmap_chunk(): invalid pointer: 0x00007f0fd32e4840 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x740ef)[0x7f0fd3b9e0ef]
/lib64/libc.so.6(+0x79646)[0x7f0fd3ba3646]
/usr/lib/modules/zabbix_module_docker.so(zbx_module_docker_net+0x636)[0x7f0fd3704cb3]
/usr/sbin/zabbix-agentd: listener #3 [processing request](process+0x353)[0x4185e3]
/usr/sbin/zabbix-agentd: listener #3 [processing request](listener_thread+0x1ad)[0x41513d]
/usr/sbin/zabbix-agentd: listener #3 [processing request](zbx_thread_start+0x3e)[0x42c45e]
/usr/sbin/zabbix-agentd: listener #3 [processing request](MAIN_ZABBIX_ENTRY+0x2c3)[0x417883]
/usr/sbin/zabbix-agentd: listener #3 [processing request](daemon_start+0x1a9)[0x42cf09]
/usr/sbin/zabbix-agentd: listener #3 [processing request](main+0x9e)[0x40d1fe]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f0fd3b4a725]
/usr/sbin/zabbix-agentd: listener #3 [processing request](_start+0x29)[0x40d309]
======= Memory map: ========
00400000-00457000 r-xp 00000000 fe:01 1052                               /usr/sbin/zabbix-agentd
00656000-00657000 r--p 00056000 fe:01 1052                               /usr/sbin/zabbix-agentd
00657000-00659000 rw-p 00057000 fe:01 1052                               /usr/sbin/zabbix-agentd
00659000-0065e000 rw-p 00000000 00:00 0 
00e22000-00e43000 rw-p 00000000 00:00 0                                  [heap]
00e43000-00e47000 rw-p 00000000 00:00 0                                  [heap]
7f0fd30cb000-7f0fd30e1000 r-xp 00000000 fe:00 299                        /lib64/libgcc_s.so.1
7f0fd30e1000-7f0fd32e0000 ---p 00016000 fe:00 299                        /lib64/libgcc_s.so.1
7f0fd32e0000-7f0fd32e1000 r--p 00015000 fe:00 299                        /lib64/libgcc_s.so.1
7f0fd32e1000-7f0fd32e2000 rw-p 00016000 fe:00 299                        /lib64/libgcc_s.so.1
7f0fd32e2000-7f0fd32e7000 r-xp 00000000 fe:00 350                        /lib64/libnss_dns-2.22.so
7f0fd32e7000-7f0fd34e6000 ---p 00005000 fe:00 350                        /lib64/libnss_dns-2.22.so
7f0fd34e6000-7f0fd34e7000 r--p 00004000 fe:00 350                        /lib64/libnss_dns-2.22.so
7f0fd34e7000-7f0fd34e8000 rw-p 00005000 fe:00 350                        /lib64/libnss_dns-2.22.so
7f0fd34e8000-7f0fd34f3000 r-xp 00000000 fe:00 1191                       /lib64/libnss_files-2.22.so
7f0fd34f3000-7f0fd36f2000 ---p 0000b000 fe:00 1191                       /lib64/libnss_files-2.22.so
7f0fd36f2000-7f0fd36f3000 r--p 0000a000 fe:00 1191                       /lib64/libnss_files-2.22.so
7f0fd36f3000-7f0fd36f4000 rw-p 0000b000 fe:00 1191                       /lib64/libnss_files-2.22.so
7f0fd36f4000-7f0fd36fa000 rw-p 00000000 00:00 0 
7f0fd36fa000-7f0fd370c000 r-xp 00000000 fe:01 1182                       /usr/lib/modules/zabbix_module_docker.so
7f0fd370c000-7f0fd390b000 ---p 00012000 fe:01 1182                       /usr/lib/modules/zabbix_module_docker.so
7f0fd390b000-7f0fd390c000 r--p 00011000 fe:01 1182                       /usr/lib/modules/zabbix_module_docker.so
7f0fd390c000-7f0fd390d000 rw-p 00012000 fe:01 1182                       /usr/lib/modules/zabbix_module_docker.so
7f0fd390d000-7f0fd3925000 r-xp 00000000 fe:00 1631                       /lib64/libpthread-2.22.so
7f0fd3925000-7f0fd3b24000 ---p 00018000 fe:00 1631                       /lib64/libpthread-2.22.so
7f0fd3b24000-7f0fd3b25000 r--p 00017000 fe:00 1631                       /lib64/libpthread-2.22.so
7f0fd3b25000-7f0fd3b26000 rw-p 00018000 fe:00 1631                       /lib64/libpthread-2.22.so
7f0fd3b26000-7f0fd3b2a000 rw-p 00000000 00:00 0 
7f0fd3b2a000-7f0fd3cc5000 r-xp 00000000 fe:00 140                        /lib64/libc-2.22.so
7f0fd3cc5000-7f0fd3ec5000 ---p 0019b000 fe:00 140                        /lib64/libc-2.22.so
7f0fd3ec5000-7f0fd3ec9000 r--p 0019b000 fe:00 140                        /lib64/libc-2.22.so
7f0fd3ec9000-7f0fd3ecb000 rw-p 0019f000 fe:00 140                        /lib64/libc-2.22.so
7f0fd3ecb000-7f0fd3ecf000 rw-p 00000000 00:00 0 
7f0fd3ecf000-7f0fd3f3d000 r-xp 00000000 fe:01 1207                       /usr/lib64/libpcre.so.1.2.7
7f0fd3f3d000-7f0fd413c000 ---p 0006e000 fe:01 1207                       /usr/lib64/libpcre.so.1.2.7
7f0fd413c000-7f0fd413d000 r--p 0006d000 fe:01 1207                       /usr/lib64/libpcre.so.1.2.7
7f0fd413d000-7f0fd413e000 rw-p 0006e000 fe:01 1207                       /usr/lib64/libpcre.so.1.2.7
7f0fd413e000-7f0fd4152000 r-xp 00000000 fe:00 1659                       /lib64/libresolv-2.22.so
7f0fd4152000-7f0fd4351000 ---p 00014000 fe:00 1659                       /lib64/libresolv-2.22.so
7f0fd4351000-7f0fd4352000 r--p 00013000 fe:00 1659                       /lib64/libresolv-2.22.so
7f0fd4352000-7f0fd4353000 rw-p 00014000 fe:00 1659                       /lib64/libresolv-2.22.so
7f0fd4353000-7f0fd4355000 rw-p 00000000 00:00 0 
7f0fd4355000-7f0fd4357000 r-xp 00000000 fe:00 306                        /lib64/libdl-2.22.so
7f0fd4357000-7f0fd4557000 ---p 00002000 fe:00 306                        /lib64/libdl-2.22.so
7f0fd4557000-7f0fd4558000 r--p 00002000 fe:00 306                        /lib64/libdl-2.22.so
7f0fd4558000-7f0fd4559000 rw-p 00003000 fe:00 306                        /lib64/libdl-2.22.so
7f0fd4559000-7f0fd4654000 r-xp 00000000 fe:00 328                        /lib64/libm-2.22.so
7f0fd4654000-7f0fd4854000 ---p 000fb000 fe:00 328                        /lib64/libm-2.22.so
7f0fd4854000-7f0fd4855000 r--p 000fb000 fe:00 328                        /lib64/libm-2.22.so
7f0fd4855000-7f0fd4856000 rw-p 000fc000 fe:00 328                        /lib64/libm-2.22.so
7f0fd4856000-7f0fd4877000 r-xp 00000000 fe:00 38                         /lib64/ld-2.22.so
7f0fd4a07000-7f0fd4a61000 rw-s 00000000 00:05 229376                     /SYSV00000000 (deleted)
7f0fd4a61000-7f0fd4a66000 rw-p 00000000 00:00 0 
7f0fd4a75000-7f0fd4a76000 rw-p 00000000 00:00 0 
7f0fd4a76000-7f0fd4a77000 rw-p 00000000 00:00 0 
7f0fd4a77000-7f0fd4a78000 r--p 00021000 fe:00 38                         /lib64/ld-2.22.so
7f0fd4a78000-7f0fd4a79000 rw-p 00022000 fe:00 38                         /lib64/ld-2.22.so
7f0fd4a79000-7f0fd4a7a000 rw-p 00000000 00:00 0 
7ffda273c000-7ffda275d000 rw-p 00000000 00:00 0                          [stack]
7ffda279f000-7ffda27a2000 r--p 00000000 00:00 0                          [vvar]
7ffda27a2000-7ffda27a4000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
 50660:20181212:134920.403 found metric TX-OK: 44000823
 50660:20181212:134920.403 Sending back [44000823]
 50660:20181212:134920.404 __zbx_zbx_setproctitle() title:'listener #1 [waiting for connection]'
 50658:20181212:134920.406 One child process died (PID:50662,exitcode/signal:6). Exiting ...
 50658:20181212:134920.406 zbx_on_exit() called
 50659:20181212:134920.406 Got signal [signal:15(SIGTERM),sender_pid:50658,sender_uid:0,reason:0]. Exiting ...
 50663:20181212:134920.406 Got signal [signal:15(SIGTERM),sender_pid:50658,sender_uid:0,reason:0]. Exiting ...
 50660:20181212:134920.406 Got signal [signal:15(SIGTERM),sender_pid:50658,sender_uid:0,reason:0]. Exiting ...
 50661:20181212:134920.407 Got signal [signal:15(SIGTERM),sender_pid:50658,sender_uid:0,reason:0]. Exiting ...
zabbix-agentd [50658]: Error waiting for process with PID 50662: [10] No child processes
 50658:20181212:134920.407 In zbx_dshm_destroy() shmid:-1
 50658:20181212:134920.407 End of zbx_dshm_destroy():SUCCEED
 50658:20181212:134920.407 In zbx_unload_modules()
 50658:20181212:134920.407 In zbx_module_uninit()
 50658:20181212:134920.408 End of zbx_unload_modules()
 50658:20181212:134920.408 Zabbix Agent stopped. Zabbix 3.4.15 (revision 86739).
@jangaraj
Member

Did you compile the module for your system against the correct Zabbix version? Could you provide more of the log from before the backtrace, please?

@alexmirtoff
Author

> Did you compile the module for your system with correct Zabbix version? Could you provide more logs before backtrace, please?

Yes, I compiled it against the correct version. These hosts run in manager-worker mode.
On a host that is not part of a cluster, everything is fine.
Some logs:

 65158:20181212:145638.254 Requested [docker.mem[d8dc14731a9ca28c2c8f4b2c3063db03f752b751ef40bf17d0c07e169a3e2918,total_cache]]
 65158:20181212:145638.254 In zbx_module_docker_mem()
 65158:20181212:145638.254 In zbx_module_docker_get_fci()
 65158:20181212:145638.254 Original full container id will be used
 65158:20181212:145638.254 Metric source file: /sys/fs/cgroup/memory/docker/d8dc14731a9ca28c2c8f4b2c3063db03f752b751ef40bf17d0c07e169a3e2918/memory.stat
 65158:20181212:145638.254 Looking metric total_cache in memory.stat file
 65158:20181212:145638.254 Id: d8dc14731a9ca28c2c8f4b2c3063db03f752b751ef40bf17d0c07e169a3e2918; metric: total_cache; value: 77406208
 65158:20181212:145638.254 Sending back [77406208]
 65158:20181212:145638.255 __zbx_zbx_setproctitle() title:'listener #3 [waiting for connection]'
 65158:20181212:145638.256 __zbx_zbx_setproctitle() title:'listener #3 [processing request]'
 65158:20181212:145638.257 Requested [docker.mem[3fd2b78b602d02a879dffb33a0073725d38dc04c48959a50b5b115dae7feba9b,total_rss]]
 65158:20181212:145638.257 In zbx_module_docker_mem()
 65158:20181212:145638.257 In zbx_module_docker_get_fci()
 65158:20181212:145638.257 Original full container id will be used
 65158:20181212:145638.257 Metric source file: /sys/fs/cgroup/memory/docker/3fd2b78b602d02a879dffb33a0073725d38dc04c48959a50b5b115dae7feba9b/memory.stat
 65158:20181212:145638.258 Looking metric total_rss in memory.stat file
 65158:20181212:145638.258 Id: 3fd2b78b602d02a879dffb33a0073725d38dc04c48959a50b5b115dae7feba9b; metric: total_rss; value: 73814016
 65158:20181212:145638.258 Sending back [73814016]
 65158:20181212:145638.258 __zbx_zbx_setproctitle() title:'listener #3 [waiting for connection]'
 65157:20181212:145638.260 __zbx_zbx_setproctitle() title:'listener #2 [processing request]'
 65157:20181212:145638.261 Requested [docker.mem[598fb024a76008b3919ba2debe37319d9a96d2a90dce521f23b6dc7c3dd2a648,total_swap]]
 65157:20181212:145638.261 In zbx_module_docker_mem()
 65157:20181212:145638.261 In zbx_module_docker_get_fci()
 65157:20181212:145638.261 Original full container id will be used
 65157:20181212:145638.261 Metric source file: /sys/fs/cgroup/memory/docker/598fb024a76008b3919ba2debe37319d9a96d2a90dce521f23b6dc7c3dd2a648/memory.stat
 65157:20181212:145638.261 Cannot open metric file: '/sys/fs/cgroup/memory/docker/598fb024a76008b3919ba2debe37319d9a96d2a90dce521f23b6dc7c3dd2a648/memory.stat'
 65157:20181212:145638.261 Sending back [ZBX_NOTSUPPORTED: Cannot open memory.stat file]
 65157:20181212:145638.261 __zbx_zbx_setproctitle() title:'listener #2 [waiting for connection]'
 65157:20181212:145638.263 __zbx_zbx_setproctitle() title:'listener #2 [processing request]'
 65157:20181212:145638.264 Requested [docker.up[f508d3c86f17820bf51dea6517045a1ce6dddc457d53ec397c61309ecd6b090e]]
 65157:20181212:145638.264 In zbx_module_docker_up()
 65157:20181212:145638.264 In zbx_module_docker_get_fci()
 65157:20181212:145638.264 Original full container id will be used
 65157:20181212:145638.264 Metric source file: /sys/fs/cgroup/cpu,cpuacct/docker/f508d3c86f17820bf51dea6517045a1ce6dddc457d53ec397c61309ecd6b090e/cpuacct.stat
 65157:20181212:145638.264 Cannot open metric file: '/sys/fs/cgroup/cpu,cpuacct/docker/f508d3c86f17820bf51dea6517045a1ce6dddc457d53ec397c61309ecd6b090e/cpuacct.stat', container doesn't run
 65157:20181212:145638.264 Sending back [0]
 65157:20181212:145638.264 __zbx_zbx_setproctitle() title:'listener #2 [waiting for connection]'
 65157:20181212:145638.266 __zbx_zbx_setproctitle() title:'listener #2 [processing request]'
 65157:20181212:145638.267 Requested [docker.xnet[f3a1997592d3b0dc7cad00e834759e8f699e9e96108d5d6dc0c3d5afe38701a3,eth0,RX-OK]]
 65157:20181212:145638.267 In zbx_module_docker_net()
 65157:20181212:145638.267 In zbx_module_docker_get_fci()
 65157:20181212:145638.267 Original full container id will be used
 65157:20181212:145638.267 netns file: /var/run/netns/zabbix_module_docker_f3a1997592d3b0dc7cad00e834759e8f699e9e96108d5d6dc0c3d5afe38701a3
 65157:20181212:145638.267 Tasks file: /sys/fs/cgroup/devices/docker/f3a1997592d3b0dc7cad00e834759e8f699e9e96108d5d6dc0c3d5afe38701a3/tasks
 65157:20181212:145638.267 Cannot open Docker tasks file: '/sys/fs/cgroup/devices/docker/f3a1997592d3b0dc7cad00e834759e8f699e9e96108d5d6dc0c3d5afe38701a3/tasks'
 65157:20181212:145638.267 Sending back [ZBX_NOTSUPPORTED: Cannot open Docker tasks file]
 65157:20181212:145638.267 __zbx_zbx_setproctitle() title:'listener #2 [waiting for connection]'
 65157:20181212:145638.273 __zbx_zbx_setproctitle() title:'listener #2 [processing request]'
 65157:20181212:145638.274 Requested [docker.xnet[71b227a3c00d0b6862cd82187d9bcd68be4698ece453bc90c3ff8dd6bc3b6f26,eth0,RX-OK]]
 65157:20181212:145638.274 In zbx_module_docker_net()
 65157:20181212:145638.274 In zbx_module_docker_get_fci()
 65157:20181212:145638.274 Original full container id will be used
 65157:20181212:145638.274 netns file: /var/run/netns/zabbix_module_docker_71b227a3c00d0b6862cd82187d9bcd68be4698ece453bc90c3ff8dd6bc3b6f26
 65157:20181212:145638.274 Tasks file: /sys/fs/cgroup/devices/docker/71b227a3c00d0b6862cd82187d9bcd68be4698ece453bc90c3ff8dd6bc3b6f26/tasks
*** Error in `/usr/sbin/zabbix-agentd: listener #2 [processing request]': munmap_chunk(): invalid pointer: 0x00007f9ea11ce840 ***

@jangaraj
Member

The problem is with docker.xnet. Did you fulfill the requirements mentioned in the README?

Note 1: Root permissions (AllowRoot=1) are required, because net namespaces (/var/run/netns/) are created/used
Note 2: netstat is needed to be installed and available in PATH
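
The two README requirements can also be checked programmatically. The following is only a sketch under stated assumptions: `is_in_path` and `running_as_root` are hypothetical helper names, not functions from zabbix_module_docker or Zabbix, and checking the effective UID only approximates what AllowRoot=1 permits.

```c
#define _POSIX_C_SOURCE 200809L

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the module): returns 1 if an
 * executable entry named `prog` exists in one of the PATH directories,
 * otherwise 0.
 */
static int is_in_path(const char *prog)
{
    const char *path = getenv("PATH");

    if (NULL == path || NULL == prog || '\0' == *prog)
        return 0;

    /* strtok_r() modifies its input, so work on a private copy. */
    char *copy = strdup(path);
    if (NULL == copy)
        return 0;

    int found = 0;
    char *save = NULL;

    for (char *dir = strtok_r(copy, ":", &save);
         NULL != dir && !found;
         dir = strtok_r(NULL, ":", &save))
    {
        char candidate[PATH_MAX];
        int n = snprintf(candidate, sizeof(candidate), "%s/%s", dir, prog);

        /* Skip candidates that did not fit into the buffer. */
        if (n > 0 && (size_t)n < sizeof(candidate) &&
            0 == access(candidate, X_OK))
            found = 1;
    }

    free(copy);
    return found;
}

/*
 * Hypothetical helper: returns 1 if the calling process has an
 * effective UID of 0 (root), otherwise 0.
 */
static int running_as_root(void)
{
    return 0 == geteuid();
}
```

Calling `is_in_path("netstat")` and `running_as_root()` from a small test program on the affected host would confirm both notes without relying on the agent's own logging.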

@alexmirtoff
Author

alexmirtoff commented Dec 12, 2018

  1. AllowRoot=1 is set
  2. Netstat is installed and available.

Some network data appeared in Zabbix before the agent died.

@jangaraj
Member

It is probably crashing somewhere in this part:

zabbix_log(LOG_LEVEL_DEBUG, "Tasks file: %s", filename2);
FILE *file;
if (NULL == (file = fopen(filename2, "r")))
{
    zabbix_log(LOG_LEVEL_ERR, "Cannot open Docker tasks file: '%s'", filename2);
    free(container);
    free(filename);
    free(netns);
    free(filename2);
    SET_MSG_RESULT(result, strdup("Cannot open Docker tasks file"));
    return SYSINFO_RET_FAIL;
}
char line[MAX_STRING_LEN];
char* first_task;
while (NULL != fgets(line, sizeof(line), file))
{
    first_task = string_replace(line, "\n", "");
    zabbix_log(LOG_LEVEL_DEBUG, "First task for container %s: %s", container, first_task);
    break;
}

Probably one of the pointers passed to free() is not valid. Confirming this will require deeper investigation.

@forum77alive

forum77alive commented Feb 4, 2020

I also have this problem, but with zabbix-agent version 4.4.3 on Debian 9.9.

@forum77alive

But after I downgraded zabbix-agent to 4.2.8 and compiled the .so, it worked!

@Lucefron

Same problem:
OS: Ubuntu 18.04, Debian 9, Debian 10
Agent version: 5.0.12
zabbix_module_docker.so was downloaded from the master branch.

@i-ky
Contributor

i-ky commented Feb 16, 2022

It looks to me that the problem is here:

char* first_task;
while (NULL != fgets(line, sizeof(line), file))
{
    first_task = string_replace(line, "\n", "");
    zabbix_log(LOG_LEVEL_DEBUG, "First task for container %s: %s", container, first_task);
    break;
}

If fgets() fails, the loop body is never executed and first_task is left uninitialized, so the subsequent attempt to release that memory: ... will lead to a crash.

The solution would be to convert this while loop into an if/else construct. However, I don't know what to put in the else branch, because I am looking at it purely from a C developer's perspective. @jangaraj and the rest, what does it mean if the Tasks file is empty? How should the module behave in this case?
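
For illustration, here is a minimal sketch of what the if/else conversion could look like. It rests on several assumptions: `read_first_task` is a hypothetical helper, not a function from the module; the 2048-byte buffer stands in for MAX_STRING_LEN; and a plain copy replaces the module's string_replace(). What the module should actually report for an empty Tasks file is still the open question, so the sketch only returns NULL and leaves that decision to the caller.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of zabbix_module_docker): returns a
 * heap-allocated copy of the first line of `file` with the trailing
 * newline removed, or NULL if nothing could be read or the allocation
 * failed. The result can be passed to free() in either case, because
 * free(NULL) is a no-op.
 */
static char *read_first_task(FILE *file)
{
    char line[2048];           /* stand-in for MAX_STRING_LEN */
    char *first_task = NULL;   /* initialized on every path */

    if (NULL != fgets(line, sizeof(line), file))
    {
        /* Cut the line at the first newline, if there is one. */
        line[strcspn(line, "\n")] = '\0';

        first_task = malloc(strlen(line) + 1);
        if (NULL != first_task)
            strcpy(first_task, line);
    }
    else
    {
        /* Empty file or read error: first_task stays NULL. */
    }

    return first_task;
}
```

A caller would test the result for NULL, report an error of its choice in that case, and call free() on the pointer unconditionally afterwards.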
