[docs] Add information about default alert settings #611 fixes #611 #640

Unnati-Gupta24 · 2025-02-21T09:32:15Z

Checklist

I have read the OpenWISP Contributing Guidelines.
I have manually tested the changes proposed in this pull request.
I have written new test cases for new code and/or updated existing tests for changes to existing code.
I have updated the documentation.

Reference to Existing Issue

Closes #611.

Description of Changes

Ping Dependency: The ping check assumes that the management interface is configured correctly. The device status changes to "critical" as soon as the ping check fails to reach that interface.

Config Applied Timing: A 5-minute tolerance prevents alerts for short-lived delays in applying the configuration.

WiFi Clients: The maximum and minimum client thresholds reflect the expected network demand.

Iperf3: Alerts are disabled by default, but they can be enabled.
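
To illustrate how an operator, a threshold and a tolerance could interact, here is a minimal sketch. It is not the OpenWISP Monitoring implementation: the function name threshold_crossed and the (timestamp, value) sample format are assumptions made for illustration, and the tolerance is taken to be in minutes, as in the 5-minute example above.

```python
import operator
from datetime import datetime, timedelta

# Maps the operator strings used in 'alert_settings' to comparison functions.
OPERATORS = {'<': operator.lt, '>': operator.gt}


def threshold_crossed(samples, alert_settings, now):
    """Return True if the threshold has been crossed for at least the tolerance.

    samples: list of (datetime, value) tuples, oldest first (hypothetical format).
    alert_settings: dict with 'operator', 'threshold' and 'tolerance' (minutes).
    """
    compare = OPERATORS[alert_settings['operator']]
    threshold = alert_settings['threshold']
    crossed_since = None
    # Walk back from the newest sample to find the start of the current
    # uninterrupted run of values that cross the threshold.
    for timestamp, value in reversed(samples):
        if not compare(value, threshold):
            break
        crossed_since = timestamp
    if crossed_since is None:
        return False
    return now - crossed_since >= timedelta(minutes=alert_settings['tolerance'])


now = datetime(2025, 2, 21, 12, 0)
ping = {'operator': '<', 'threshold': 1, 'tolerance': 0}
config_applied = {'operator': '<', 'threshold': 1, 'tolerance': 5}

# Tolerance 0: a single failed ping is enough.
print(threshold_crossed([(now, 0)], ping, now))
# Tolerance 5: two minutes of "not applied" is not enough yet.
print(threshold_crossed([(now - timedelta(minutes=2), 0), (now, 0)], config_applied, now))
```

With a tolerance of 0 the first call reports a crossing immediately, while the second call stays False until the value has been below the threshold for five minutes without interruption.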

@pandafy @nemesifier please review it.

The default alert settings described above are defined in the file
openwisp_monitoring/monitoring/configuration.py

DEFAULT_METRICS = {
    'ping': {
        'label': _('Ping'),
        'name': 'Ping',
        'key': 'ping',
        'field_name': 'reachable',
        'related_fields': ['loss', 'rtt_min', 'rtt_max', 'rtt_avg'],
        'charts': {
            'uptime': {
                'type': 'bar',
                'title': _('Ping Success Rate'),
                'description': _(
                    'A value of 100% means reachable, 0% means unreachable, values in '
                    'between 0% and 100% indicate the average reachability in the '
                    'period observed. Obtained with the fping linux program.'
                ),
              .
              .
              .
              .
        'alert_settings': {'operator': '<', 'threshold': 1, 'tolerance': 0},
        'notification': {
            'problem': {
                'verbose_name': 'Ping PROBLEM',
                'verb': _('is not reachable'),
                'level': 'warning',
                'email_subject': _(
                    '[{site.name}] PROBLEM: {notification.target} {notification.verb}'
                ),
                'message': _(
                    'The device [{notification.target}]({notification.target_link}) '
                    '{notification.verb}.'
                ),
            },
            'recovery': {
                'verbose_name': 'Ping RECOVERY',
                'verb': _('is reachable again'),
                'level': 'info',
                'email_subject': _(
                    '[{site.name}] RECOVERY: {notification.target} {notification.verb}'
                ),
                'message': _(
                    'The device [{notification.target}]({notification.target_link}) '
                    '{notification.verb}.'
                ),
            },
        },
    },
    'config_applied': {
        'label': _('Configuration Applied'),
        'name': 'Configuration Applied',
        'key': 'config_applied',
        'field_name': 'config_applied',
        'alert_settings': {'operator': '<', 'threshold': 1, 'tolerance': 5},
        'notification': {
            'problem': {
                'verbose_name': 'Configuration Applied PROBLEM',
                'verb': _('has not been applied'),
                'level': 'warning',
                'email_subject': _(
                    '[{site.name}] PROBLEM: {notification.target} configuration '
                    'status issue'
                ),
                'message': _(
                    'The configuration of device [{notification.target}]'
                    '({notification.target_link}) {notification.verb} in a timely manner.'
                ),
            },
            'recovery': {
                'verbose_name': 'Configuration Applied RECOVERY',
                'verb': _('configuration has been applied again'),
                'level': 'info',
                'email_subject': _(
                    '[{site.name}] RECOVERY: {notification.target} {notification.verb} '
                    'successfully'
                ),
                'message': _(
                    'The configuration of device [{notification.target}]({notification.target_link}) '
                    '{notification.verb} successfully.'
                ),
            },
        },
    },
  .
  .
  .
  .
    'wifi_clients_max': {
        'label': _('WiFi Clients (Maximum)'),
        'name': '{name}',
        'key': 'wifi_clients_max',
        'field_name': 'clients',
        'alert_settings': {'operator': '>', 'threshold': 50, 'tolerance': 120},
        'notification': {
            'problem': {
                'verbose_name': 'Max WiFi clients PROBLEM',
                'verb': _('exceeds the expected threshold'),
                'level': 'warning',
                'email_subject': _(
                    '[{site.name}] PROBLEM: {notification.target} has too many WiFi clients'
                ),
                'message': _(
                    'The WiFi client count on [{notification.target}]({notification.target_link})'
                    ' {notification.verb}.'
                ),
            },
            'recovery': {
                'verbose_name': 'Max WiFi clients RECOVERY',
                'verb': _('has decreased'),
                'level': 'info',
                'email_subject': _(
                    '[{site.name}] RECOVERY: {notification.target} WiFi client count has returned to normal'
                ),
                'message': _(
                    'The WiFi client count on [{notification.target}]({notification.target_link})'
                    ' {notification.verb} and is now within the expected range.'
                ),
            },
        },
    },
    'wifi_clients_min': {
        'label': _('WiFi Clients (Minimum)'),
        'name': '{name}',
        'key': 'wifi_clients_min',
        'field_name': 'clients',
        'alert_settings': {'operator': '<', 'threshold': 1, 'tolerance': 0},
        'notification': {
            'problem': {
                'verbose_name': 'Min WiFi clients PROBLEM',
                'verb': _('is below the expected threshold'),
                'level': 'warning',
                'email_subject': _(
                    '[{site.name}] PROBLEM: {notification.target} has too few WiFi clients'
                ),
                'message': _(
                    'The WiFi client count on [{notification.target}]({notification.target_link})'
                    ' {notification.verb}.'
                ),
            },
            'recovery': {
                'verbose_name': 'Min WiFi clients RECOVERY',
                'verb': _('has increased'),
                'level': 'info',
                'email_subject': _(
                    '[{site.name}] RECOVERY: {notification.target} has WiFi clients connecting again'
                ),
                'message': _(
                    'The WiFi client count on [{notification.target}]({notification.target_link})'
                    ' {notification.verb} and is now within the expected range.'
                ),
            },
        },
    },
  .
  .
  .
 .
            'recovery': {
                'verbose_name': 'Disk usage RECOVERY',
                'verb': _('has returned to normal levels'),
                'level': 'info',
                'email_subject': _(
                    '[{site.name}] RECOVERY: {notification.target} disk usage '
                    '{notification.verb}'
                ),
                'message': _(
                    'The device [{notification.target}]({notification.target_link}) '
                    'disk usage {notification.verb}.'
                ),
            },
        },
    },
    'memory': {
        'label': _('Memory usage'),
        'name': 'Memory usage',
        'key': 'memory',
        'field_name': 'percent_used',
        'related_fields': [
            'total_memory',
            'free_memory',
            'buffered_memory',
            'shared_memory',
            'cached_memory',
            'available_memory',
        ],
        'charts': {
            'memory': {
                'type': 'scatter',
                'title': _('Memory Usage'),
                'description': _('Percentage of memory (RAM) being used.'),
                'summary_labels': [_('Memory Usage')],
                'unit': '%',
                'colors': [DEFAULT_COLORS[4]],
                'order': 250,
                'query': chart_query['memory'],
            }
        },
        'alert_settings': {'operator': '>', 'threshold': 95, 'tolerance': 5},
        'notification': {
            'problem': {
                'verbose_name': 'Memory usage PROBLEM',
                'verb': _('is experiencing a peak in'),
                'level': 'warning',
                'email_subject': _(
                    '[{site.name}] PROBLEM: {notification.target} {notification.verb} RAM usage'
                ),
                'message': _(
                    'The device [{notification.target}]({notification.target_link}) '
                    '{notification.verb} RAM usage which has gone '
                    'over {notification.actor.alertsettings.threshold}%.'
                ),
            },
            'recovery': {
                'verbose_name': 'Memory usage RECOVERY',
                'verb': _('has returned to normal levels'),
                'level': 'info',
                'email_subject': _(
                    '[{site.name}] RECOVERY: {notification.target} RAM usage {notification.verb}'
                ),
                'message': _(
                    'The device [{notification.target}]({notification.target_link}) RAM usage '
                    '{notification.verb}.'
                ),
            },
        },
    },
    'cpu': {
        'label': _('CPU usage'),
        'name': 'CPU usage',
        'key': 'cpu',
        'field_name': 'cpu_usage',
        'related_fields': ['load_1', 'load_5', 'load_15'],
        'charts': {
            'cpu': {
                'type': 'scatter',
                'title': _('CPU Load'),
                'description': _(
                    'Average CPU load, measured using the Linux load averages, '
                    'taking into account the number of available CPUs.'
                ),
                'summary_labels': [_('CPU Load')],
                'unit': '%',
                'colors': [DEFAULT_COLORS[-3]],
                'order': 260,
                'query': chart_query['cpu'],
            }
        },
        'alert_settings': {'operator': '>', 'threshold': 90, 'tolerance': 5},
        'notification': {
            'problem': {
                'verbose_name': 'CPU usage PROBLEM',
                'verb': _('is experiencing a peak in'),
                'level': 'warning',
                'email_subject': _(
                    '[{site.name}] PROBLEM: {notification.target} {notification.verb} CPU usage'
                ),
                'message': _(
                    'The device [{notification.target}]({notification.target_link}) '
                    '{notification.verb} CPU usage which has gone '
                    'over {notification.actor.alertsettings.threshold}%.'
                ),
            },
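
As a reading aid, the alert settings visible in the excerpt above can be turned into one-line summaries. This is only a sketch: the values are copied from the excerpt, the helper name describe is hypothetical, tolerance is assumed to be in minutes, and metrics whose settings are elided in the excerpt (for example disk usage) are left out.

```python
# Alert settings copied from the DEFAULT_METRICS excerpt above.
DEFAULT_ALERT_SETTINGS = {
    'ping': {'operator': '<', 'threshold': 1, 'tolerance': 0},
    'config_applied': {'operator': '<', 'threshold': 1, 'tolerance': 5},
    'wifi_clients_max': {'operator': '>', 'threshold': 50, 'tolerance': 120},
    'wifi_clients_min': {'operator': '<', 'threshold': 1, 'tolerance': 0},
    'memory': {'operator': '>', 'threshold': 95, 'tolerance': 5},
    'cpu': {'operator': '>', 'threshold': 90, 'tolerance': 5},
}


def describe(key, settings):
    """Return a human-readable summary of one alert_settings dict."""
    direction = 'below' if settings['operator'] == '<' else 'above'
    if settings['tolerance'] == 0:
        timing = 'immediately'
    else:
        timing = f"after {settings['tolerance']} minutes"
    return f"{key}: alert when the value is {direction} {settings['threshold']}, {timing}"


for key, settings in DEFAULT_ALERT_SETTINGS.items():
    print(describe(key, settings))
```

A summary in this form could serve as a starting point for the documentation text, with each line checked against the configuration file before publishing.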

Unnati-Gupta24 · 2025-02-23T17:33:40Z

I have now run the QA checks locally.
I restructured the files and made some changes, and all the tests are now passing.

Unnati-Gupta24 added 3 commits February 21, 2025 14:46

[docs] Add information about default alert settings openwisp#611 fixes …

4b0074a

…openwisp#611

[docs] Add information about default alert settings openwisp#611 fixes …

40117b1

…openwisp#611

[docs] Add information about default alert settings openwisp#611

c07471e

Fixes openwisp#611

devkapilbansal self-requested a review February 23, 2025 18:56

devkapilbansal added the documentation Improvements or additions to documentation label Feb 24, 2025
