False Positive #912

piflav · 2022-01-26T12:40:52Z

Hi,
For a few months I've been using SimpleMonitor to monitor 150+ hosts from a Windows Server.
The setup is very basic: a ping every minute, plus Pushover notifications and an HTML status page:

[HostName]
type=ping
host=172.x.y.z
tolerance=5

It works fine, but I've noticed that when a host is down for a long time, another host is often reported as going up and down every 10 to 15 minutes, even though no packets were actually lost (I checked by pinging it directly from the command line).
The false positive seems to be reported for the host that comes immediately before the down host in the configuration file.
For example:

#Host reported as flapping even though it is UP
[Host-A]
type=ping
host=172.x.y.z
tolerance=5

#Host DOWN for a long time
[Host-B]
type=ping
host=172.x.y.z
tolerance=5

If I comment out the Host-B configuration, the problem disappears.
My Python knowledge is very limited, so I didn't go through the code to find where the problem could be.

Thanks

jamesoff · 2022-01-26T12:46:42Z

Interesting; could you let me know what version you're using (and which Python version)?

Is it always the host above the failed one which flaps? Any feel for roughly how long "Host-B" would need to be down for the problem to manifest?

piflav · 2022-01-26T14:19:42Z

Hi, I'm using Python 3.9.6 on Windows Server 2016 Standard. SimpleMonitor is 1.11.0.

*Is it always the host above the failed one which flaps?*
I think so.

*Any feel for roughly how long "Host-B" would need to be down for the problem to manifest?*
It looks random. The up/down timing also varies: it could be 15 minutes, then 2 minutes, and so on.

jamesoff · 2022-01-26T14:30:32Z

Thanks for the info, I'll have a go at reproducing it. Hope the workaround of disabling/removing the long-term down host is ok for you for now.

piflav · 2022-02-03T15:50:39Z

Hi,
Today I noticed another issue that is probably related to the same bug.
About 5 of the 150+ monitored hosts report a ping time of 0.000ms, which is impossible because they are hundreds of km away.
As an example, here are the logs for hosts at the same location:

2022-02-03 15:01:38+01:00 FR-Saint-Denis-VRRP: ok (0.000s) (Ping time 15.584ms)
2022-02-03 15:01:38+01:00 FR-Saint-Denis-LAN1: ok (0.000s) (Ping time 0.000ms)
2022-02-03 15:01:38+01:00 FR-Saint-Denis-LAN2: ok (0.000s) (Ping time 15.616ms)
2022-02-03 15:01:38+01:00 FR-Saint-Denis-L3: ok (0.000s) (Ping time 15.626ms)
2022-02-03 15:01:38+01:00 FR-Saint-Denis-LB1: ok (0.000s) (Ping time  0.000ms)
2022-02-03 15:01:38+01:00 FR-Saint-Denis-LB2: ok (0.000s) (Ping time 15.621ms)

I suspect it's a bug in ping3, maybe because I'm pinging 150+ hosts every minute and the time between pings is too short.

How do you manage that?

jamesoff · 2022-02-04T10:10:37Z

Agreed, that is odd. Not sure what's going on there, but if it's legit I want that network :)

Is it always those hosts?

Could you maybe try changing them to the host monitor? This is the original one for pinging hosts and works by actually running ping rather than being implemented in Python.

piflav · 2022-02-04T15:39:34Z

I've replaced ping with host, and everything appears to work as expected, even when a host has been down for an hour.
One thing to report: no round-trip details appear on the HTML page or in the logs.
I haven't set ping_regexp or time_regexp (left at the automatic defaults).

FYI, the output of the ping command on the server is:

C:\>ping -n 1 -w 1000 172.20.51.1

Pinging 172.20.51.1 with 32 bytes of data:
Reply from 172.20.51.1: bytes=32 time=26ms TTL=56

Ping statistics for 172.20.51.1:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 26ms, Maximum = 26ms, Average = 26ms
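
For reference, a minimal sketch of how output like the above could be parsed for reachability and round-trip time. The two patterns and the `parse_ping` helper are hypothetical and written only for this example; they are not SimpleMonitor's default `ping_regexp` or `time_regexp`.

```python
import re

# Hypothetical patterns for illustration only; NOT SimpleMonitor's defaults.
SUCCESS_RE = re.compile(r"Received = (\d+)")
TIME_RE = re.compile(r"time[=<](\d+(?:\.\d+)?)\s*ms")

# The Windows ping output quoted in this comment.
SAMPLE = """Pinging 172.20.51.1 with 32 bytes of data:
Reply from 172.20.51.1: bytes=32 time=26ms TTL=56

Ping statistics for 172.20.51.1:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 26ms, Maximum = 26ms, Average = 26ms
"""


def parse_ping(output):
    """Return (reachable, rtt_ms or None) from Windows-style ping output."""
    received = SUCCESS_RE.search(output)
    reachable = bool(received) and int(received.group(1)) > 0
    rtt = TIME_RE.search(output)
    return reachable, (float(rtt.group(1)) if rtt else None)


print(parse_ping(SAMPLE))
```

If the time pattern fails to match, the check can still succeed while the detail field has no ping time, which would look like the behaviour described here. That is only a guess at the cause, not a confirmed diagnosis.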

jamesoff · 2022-02-04T17:40:10Z

Glad that's fixed the weird behaviour for those hosts. It should include the ping time in the detail field; I'll take a look to see if I can see why it isn't.

piflav · 2022-12-28T07:20:59Z

There's no difference whether I use ping or host.
The temporary workaround is to disable multithreading with -j 1.

jamesoff · 2022-12-29T15:55:57Z

Thanks for the update. I'm also seeing this with a couple of my monitors recently (I have some kit unplugged so it's definitely not going to be up, despite what SimpleMonitor is occasionally reporting ;)

Interesting to know disabling multithreading helps, I'll have a look upstream at the library I'm using for it to see if there's any fix.

jamesoff · 2022-12-29T16:01:08Z

That didn't take long to track down; the library has an issue with multithreading: kyan001/ping3#26

I wonder if I can support both (multithreading and correct pings) by keeping all the ping monitors on one thread 🤔
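
A rough sketch of one way that idea could look, using a shared lock rather than a dedicated thread so that only one ping runs at a time while other work stays in the pool. Everything here (`serialized`, `_ping_lock`, `fake_ping`) is a hypothetical name for illustration, and `fake_ping` is a stand-in that does no network I/O; this is not SimpleMonitor's actual code.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical: a single lock shared by every ping-type check.
_ping_lock = threading.Lock()


def serialized(check):
    """Wrap a check so only one call runs at a time across all threads."""
    def wrapper(*args, **kwargs):
        with _ping_lock:
            return check(*args, **kwargs)
    return wrapper


# Bookkeeping to observe how many ping calls overlap.
_active = 0
max_active = 0
_counter_lock = threading.Lock()


@serialized
def fake_ping(host):
    """Stand-in for a real ping call; records the peak number of concurrent calls."""
    global _active, max_active
    with _counter_lock:
        _active += 1
        max_active = max(max_active, _active)
    time.sleep(0.01)  # pretend to wait for a reply
    with _counter_lock:
        _active -= 1
    return host, True


# Four worker threads, but the decorator keeps pings strictly one at a time.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_ping, [f"host-{i}" for i in range(8)]))

print(max_active, len(results))
```

The trade-off is that ping checks no longer benefit from parallelism, so a long list of slow or timing-out hosts would take longer to get through than it does with unrestricted threads.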

jamesoff added the bug label Jan 26, 2022
