SPINE Poller reports the devices down, however, the devices are up #348

Closed
MSS970 opened this issue Apr 25, 2024 · 7 comments
@MSS970

MSS970 commented Apr 25, 2024

Two Errors:

SPINE: Poller[Main Poller] Device[device-name] Hostname[ip_address] ERROR: HOST EVENT: Device is DOWN Message: ICMP: Ping timed out
SPINE: Poller[Main Poller] Device[device-name] Hostname[ip_address] ERROR: HOST EVENT: Device is DOWN Message: Device did not respond to SNMP, ICMP: Device is Alive
However, when I open the device (edit) page, both the SNMP and ICMP test results show that the device is up.
When I access the device directly, it is reachable, up, and fully operational.

To Reproduce
Steps to reproduce the behavior:

Go to the logs and search for SPINE entries containing errors.
Both kinds of error shown above are found.
Click on the device: it is up and running, and the SNMP and ICMP test results are displayed.
OS: Cacti 1.3 [dev] on Windows Server 2019
SPINE: 1.3
FPING: 4.2 for Windows
Net-SNMP 5.9.3 for Windows
Could you kindly help fix this problem?
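
For anyone triaging the same symptoms, the two messages can be separated mechanically, which helps show whether a device fails on ICMP, on SNMP, or on both. This is a minimal sketch that assumes log lines shaped like the two quoted above; the regular expression and the `classify` helper are illustrative and not part of Cacti or Spine.

```python
import re
from collections import Counter

# Pattern modeled on the two log lines quoted in this issue (an assumption,
# not an official Cacti log grammar).
DOWN_RE = re.compile(
    r"Device\[(?P<device>[^\]]*)\]\s+Hostname\[(?P<host>[^\]]*)\]\s+"
    r"ERROR: HOST EVENT: Device is DOWN Message: (?P<message>.*)$"
)


def classify(line):
    """Return (device, kind) for a 'Device is DOWN' line, else None."""
    match = DOWN_RE.search(line)
    if match is None:
        return None
    message = match.group("message")
    if "Ping timed out" in message:
        kind = "icmp_timeout"
    elif "did not respond to SNMP" in message:
        kind = "snmp_no_response"
    else:
        kind = "other"
    return match.group("device"), kind


# The two sample lines from this report, with the same placeholders.
sample = [
    "SPINE: Poller[Main Poller] Device[device-name] Hostname[ip_address] "
    "ERROR: HOST EVENT: Device is DOWN Message: ICMP: Ping timed out",
    "SPINE: Poller[Main Poller] Device[device-name] Hostname[ip_address] "
    "ERROR: HOST EVENT: Device is DOWN Message: Device did not respond to SNMP, "
    "ICMP: Device is Alive",
]

# Count occurrences per (device, kind) pair.
counts = Counter(c for c in map(classify, sample) if c is not None)
```

Feeding the real log file line by line into `classify` instead of `sample` would give a per-device tally of each failure kind.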

@MSS970
Author

MSS970 commented Apr 26, 2024

Hi Sean and TheWitness,
The information below is a copy of what I posted in the closed ticket (Cacti/cacti#5735).

What availability check do you have set for the device? Is it just ping, or SNMP and ping?
For some devices, I've configured the availability check as ping only, since SNMP is not required in those cases.
For other devices, the availability check is configured as SNMP and ping.
However, the issue is the same for both types of device.

./spine -R -V 5 -f device_id -l device_id
Below are the results:

D:>spine -R -v 5 - f 36 -l 36
D:\cacti\spine>spine -R -V 5 -f 36 -l 36
SPINE: Using spine config file [spine.conf]
0 [] spine 1527 cygwin_exception::open_stackdumpfile: Dumping stack trace to spine.exe.stackdump

D:\cacti\spine>

and below is the content of the spine.exe.stackdump:

Exception: STATUS_STACK_OVERFLOW at rip=0001004137B6
rax=000000000000E200 rbx=0000000100426060 rcx=00000007FFE03DD0
rdx=00000001004671CC rsi=0000000000000000 rdi=0000000100497E00
r8 =0000000A0005DF50 r9 =00000000FFFFFFFE r10=0000000800000000
r11=0000000100405462 r12=00000007FFF01440 r13=0000000100497E10
r14=0000000A0005E070 r15=000000010041B75D
rbp=00000007FFF00E40 rsp=00000007FFF00DB8
program=D:\cacti\spine\spine.exe, pid 1527, thread
cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
Frame Function Args
0007FFF00E40 0001004137B6 (000100402A75, 000100426060, 000000000000, 000100497E00) spine.exe+0x137B6
0007FFF00E40 00000010A200 (000100426060, 000000000000, 000100497E00, 0007FFF01440)
0007FFF00E40 00010041B780 (000000000000, 000100497E00, 0007FFF01440, 000100497E10) spine.exe+0x1B780
0007FFF00E40 000100402A75 (00010041B3CC, 00010041B6AB, 000A0005E070, 000000000000) spine.exe+0x2A75
0007FFF00E40 000100405EC3 (000000000000, 000000000000, 000100425040, 000100418DBB) spine.exe+0x5EC3
000100418DB6 000100414687 (0007FFFFCC60, 000000000000, 00000000000A, 0007FFFFCD30) spine.exe+0x14687
0007FFFFCD30 7FF9BA0F80C1 (000000000000, 000000000000, 000000000000, 000000000000) cygwin1.dll+0x80C1
0007FFFFFFF0 7FF9BA0F5C86 (000000000000, 000000000000, 000000000000, 000000000000) cygwin1.dll+0x5C86
0007FFFFFFF0 7FF9BA0F5D34 (000000000000, 000000000000, 000000000000, 000000000000) cygwin1.dll+0x5D34
End of stack trace
Loaded modules:
000100400000 spine.exe
7FFA0CBC0000 ntdll.dll
7FFA0C230000 KERNEL32.DLL
7FFA09390000 KERNELBASE.dll
0003FE3B0000 cygmariadb-3.dll
7FF9BA0F0000 cygwin1.dll
0003FE460000 cygnetsnmp-35.dll
0003FF980000 cygcrypto-1.1.dll
0003FE7C0000 cygiconv-2.dll
7FFA0B920000 ADVAPI32.dll
7FFA0C420000 msvcrt.dll
0003FCE60000 cygssl-1.1.dll
0003FC800000 cygz.dll
7FFA0BB20000 sechost.dll
7FFA09E60000 RPCRT4.dll
7FFA08EE0000 bcrypt.dll
7FFA08530000 CRYPTBASE.DLL
7FFA08FC0000 bcryptPrimitives.dll
7FF9FD9F0000 netapi32.dll
7FFA08190000 LOGONCLI.DLL
7FFA090E0000 ucrtbase.dll
7FFA08180000 NETUTILS.DLL
7FFA0A090000 wldap32.dll
7FFA0BAA0000 WS2_32.DLL
7FFA08360000 mswsock.dll
7FFA08AA0000 SspiCli.dll
7FFA028A0000 DSPARSE.dll
7FFA08420000 kerberos.DLL
7FFA08BF0000 MSASN1.dll
7FFA08E40000 msvcp_win.dll
7FFA083D0000 cryptdll.dll
7FFA03D90000 wshqos.dll
7FFA03BB0000 wshtcpip.DLL
7FFA03B20000 wship6.dll
7FFA080B0000 DNSAPI.dll
7FFA0BB10000 NSI.dll
7FFA08070000 IPHLPAPI.DLL
7FFA02E00000 rasadhlp.dll
7FFA04AA0000 fwpuclnt.dll
7FFA04C20000 SAMCLI.DLL
7FF9FFA50000 SAMLIB.dll
7FFA0A150000 user32.dll
7FFA090A0000 win32u.dll
7FFA0A100000 GDI32.dll
7FFA091E0000 gdi32full.dll
7FFA0CB60000 IMM32.DLL
7FF9FEA20000 napinsp.dll
7FF9FEA90000 winrnr.dll
7FFA03BC0000 NLAapi.dll
7FF9FEAC0000 wshbth.dll

@MSS970
Author

MSS970 commented Apr 26, 2024

Furthermore, I have run the spine as follows:

D:\cacti\spine>spine --readonly --poller=1 --conf=d:/cacti/spine/spine.conf --verbosity=1 --first=36 --last=36
SPINE: Using spine config file [d:/cacti/spine/spine.conf]
Version 1.3.0 starting
Time: 1.3484 s, Threads: 1, Devices: 1

@MSS970
Author

MSS970 commented Apr 26, 2024

How is fping entered in the settings? Use forward slashes and not back slashes.
The fping path is entered with forward slashes, not backslashes.

@TheWitness
Member

@MSS970, So, can you please look into the new permission on Windows to create RAW sockets? Microsoft introduced a new permission level that blocks RAW sockets (aka ICMP ping) unless you have a very specific permission set either for the binary (like Linux/UNIX), or for the user. Please provide feedback once you've completed your investigation.

@TheWitness
Member

TheWitness commented Aug 5, 2024

This:

image

From here:

https://learn.microsoft.com/en-us/windows/win32/winsock/tcp-ip-raw-sockets-2

Looks like the user needs to be in the Local Administrators group. And I suspect if you are testing, you have to open the console as an Administrator due to UAC.
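
A quick way to see whether a given account and console can create a raw ICMP socket at all is to try it. This is a minimal sketch using Python's standard `socket` module; the helper name `can_open_raw_icmp_socket` is illustrative. It only probes the privileges of the account that runs it and says nothing about how Spine (a Cygwin build in this report) opens its own sockets.

```python
import socket


def can_open_raw_icmp_socket():
    """Try to create a raw ICMP socket; return (ok, detail)."""
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    except OSError as exc:
        # Usually a permission error when the account lacks the privilege.
        return False, f"{type(exc).__name__}: {exc}"
    sock.close()
    return True, "raw ICMP socket created"


ok, detail = can_open_raw_icmp_socket()
print(f"raw ICMP socket allowed: {ok} ({detail})")
```

Running it once from a normal console and once from a console opened as Administrator, and then under the same account the scheduled poller task uses, should show whether UAC or group membership is the difference.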

@MSS970
Author

MSS970 commented Aug 8, 2024

@TheWitness I will check and post an update, but not until after 20 August.

@MSS970
Author

MSS970 commented Aug 21, 2024

@TheWitness,
I have checked and tested using a user who is a member of the Local Administrators group. A new incident then occurred:
The scheduled script "php.exe -c d:\cacti\php\php.ini -f D:\cacti\apache\htdocs\cacti\poller.php" failed to run.

So I had to revert to running the scheduled script under the built-in "SYSTEM" account.

To resolve the incident, I have increased the device/host failure count to 3 polling intervals.

Please proceed with the ticket closure.
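
For readers wondering why raising the failure count hides the false DOWN events: with a consecutive-failure threshold, a single spurious failed check no longer flips the device status. This is a simplified model of that idea, not Cacti's actual code; the function name `device_status` and its parameters are hypothetical.

```python
def device_status(poll_results, failure_count):
    """Simplified model: DOWN only after `failure_count` consecutive failures.

    poll_results: iterable of booleans, True meaning the availability check passed.
    Returns the modeled status after each poll.
    """
    statuses = []
    consecutive_failures = 0
    for ok in poll_results:
        # A successful check resets the counter; a failed one increments it.
        consecutive_failures = 0 if ok else consecutive_failures + 1
        statuses.append("DOWN" if consecutive_failures >= failure_count else "UP")
    return statuses


# One spurious failed check between successful polls.
polls = [True, False, True, True]
with_threshold_1 = device_status(polls, failure_count=1)
with_threshold_3 = device_status(polls, failure_count=3)
```

Under this model the device stays UP throughout with a threshold of 3, while a threshold of 1 reports DOWN on the second poll. The trade-off is that a genuine outage is also reported only after three failed polling intervals.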

MSS970 closed this as completed Dec 30, 2024