fix: remove for from evaluation and adding a threshold line #260

Merged
merged 1 commit into from
Oct 23, 2023

Conversation

geekbrother
Contributor

Description

This PR changes the Grafana alert rule and the 5xx Errors dashboard:

  • Removing the 5-minute "for" duration from the 5xx alert evaluation. With "for" set to 5m, the alert fires only when 5xx errors persist for 5 minutes straight, so shorter bursts are missed, and we want to catch every 5xx.
  • Adding a red dashed threshold line to the 5xx Errors dashboard in the Grafana configuration.
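
To illustrate the first point, here is a minimal sketch of how a "for" duration suppresses short bursts. It is a simplified model, not Grafana's implementation: the function name `firing_evaluations`, the one-evaluation-per-minute interval, and the sample data are all assumptions made for illustration.

```python
from typing import List


def firing_evaluations(breaches: List[bool], for_evals: int) -> List[bool]:
    """Return, per evaluation, whether the alert would be firing.

    breaches[i] is True when the 5xx condition held at evaluation i.
    for_evals is how many evaluation intervals the condition must keep
    holding after the first breach before the alert fires (0 = fire at once).
    Simplified model; evaluation is assumed to run once per minute.
    """
    firing = []
    streak = 0  # consecutive breaching evaluations seen so far
    for breached in breaches:
        streak = streak + 1 if breached else 0
        firing.append(breached and streak > for_evals)
    return firing


# Hypothetical data: two short 5xx bursts, one reading per minute.
bursts = [False, True, True, False, False, True, False, False]

# With a 5-minute "for", neither burst lasts long enough to fire.
print(any(firing_evaluations(bursts, for_evals=5)))

# With "for" removed, every breaching evaluation fires.
print(firing_evaluations(bursts, for_evals=0) == bursts)
```

The trade-off is the usual one: without a "for" duration the alert is more sensitive, so it catches every burst but can also be noisier.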

Resolves #210

How Has This Been Tested?

Tested by pushing the JSON config to the Grafana dashboard.

Due Diligence

  • Breaking change
  • Requires a documentation update
  • Requires an e2e/integration test update

@geekbrother geekbrother added the area-telemetry Metrics & Monitoring label Oct 21, 2023
@geekbrother geekbrother self-assigned this Oct 21, 2023
@arein arein added the accepted The issue has been accepted into the project label Oct 21, 2023
@geekbrother geekbrother temporarily deployed to staging October 21, 2023 14:21 — with GitHub Actions Inactive
@github-actions
Contributor

Show Plan

[command]/home/runner/work/_temp/3f4c691d-2806-4181-a438-88c52b723148/terraform-bin -chdir=terraform show -no-color /tmp/plan.tfplan

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  ~ update in-place
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # module.ecs.aws_ecs_service.app_service will be updated in-place
  ~ resource "aws_ecs_service" "app_service" {
        id                                 = "arn:aws:ecs:eu-central-1:898587786287:service/staging-push/staging-push-service"
        name                               = "staging-push-service"
        tags                               = {}
      ~ task_definition                    = "arn:aws:ecs:eu-central-1:898587786287:task-definition/staging-push:126" -> (known after apply)
        # (15 unchanged attributes hidden)

        # (4 unchanged blocks hidden)
    }

  # module.ecs.aws_ecs_task_definition.app_task_definition must be replaced
-/+ resource "aws_ecs_task_definition" "app_task_definition" {
      ~ arn                      = "arn:aws:ecs:eu-central-1:898587786287:task-definition/staging-push:126" -> (known after apply)
      ~ arn_without_revision     = "arn:aws:ecs:eu-central-1:898587786287:task-definition/staging-push" -> (known after apply)
      ~ container_definitions    = (sensitive value) # forces replacement
      ~ id                       = "staging-push" -> (known after apply)
      ~ revision                 = 126 -> (known after apply)
      - tags                     = {} -> null
        # (9 unchanged attributes hidden)

        # (1 unchanged block hidden)
    }

  # module.monitoring.grafana_dashboard.at_a_glance will be updated in-place
  ~ resource "grafana_dashboard" "at_a_glance" {
      ~ config_json  = jsonencode(
          ~ {
              ~ panels               = [
                    # (6 unchanged elements hidden)
                    {
                        datasource  = {
                            type = "cloudwatch"
                            uid  = "XnMFKnQVk"
                        }
                        fieldConfig = {
                            defaults  = {
                                color      = {
                                    mode = "palette-classic"
                                }
                                custom     = {
                                    axisLabel         = ""
                                    axisPlacement     = "auto"
                                    barAlignment      = 0
                                    drawStyle         = "line"
                                    fillOpacity       = 0
                                    gradientMode      = "none"
                                    hideFrom          = {
                                        legend  = false
                                        tooltip = false
                                        viz     = false
                                    }
                                    lineInterpolation = "linear"
                                    lineWidth         = 1
                                    pointSize         = 5
                                    scaleDistribution = {
                                        type = "linear"
                                    }
                                    showPoints        = "auto"
                                    spanNulls         = false
                                    stacking          = {
                                        group = "A"
                                        mode  = "none"
                                    }
                                    thresholdsStyle   = {
                                        mode = "off"
                                    }
                                }
                                mappings   = []
                                thresholds = {
                                    mode  = "absolute"
                                    steps = [
                                        {
                                            color = "green"
                                            value = null
                                        },
                                        {
                                            color = "red"
                                            value = 80
                                        },
                                    ]
                                }
                            }
                            overrides = []
                        }
                        gridPos     = {
                            h = 9
                            w = 7
                            x = 0
                            y = 18
                        }
                        options     = {
                            legend  = {
                                calcs       = []
                                displayMode = "list"
                                placement   = "bottom"
                            }
                            tooltip = {
                                mode = "single"
                                sort = "none"
                            }
                        }
                        targets     = [
                            {
                                alias            = ""
                                datasource       = {
                                    type = "cloudwatch"
                                    uid  = "XnMFKnQVk"
                                }
                                dimensions       = {
                                    LoadBalancer = "app/staging-push-load-balancer/aea5ef9d0a34453a"
                                }
                                expression       = ""
                                id               = ""
                                matchExact       = true
                                metricEditorMode = 0
                                metricName       = "RequestCount"
                                metricQueryType  = 0
                                namespace        = "AWS/ApplicationELB"
                                period           = ""
                                queryMode        = "Metrics"
                                refId            = "A"
                                region           = "default"
                                sqlExpression    = ""
                                statistic        = "Sum"
                            },
                        ]
                        title       = "Requests"
                        type        = "timeseries"
                    },
                  ~ {
                      ~ alert       = {
                          ~ for                 = "5m" -> ""
                            name                = "staging Echo Server 5XX alert"
                            # (8 unchanged attributes hidden)
                        }
                      ~ fieldConfig = {
                          ~ defaults  = {
                              ~ custom     = {
                                  ~ hideFrom          = {
                                      + mode    = "dashed"
                                        # (3 unchanged attributes hidden)
                                    }
                                    # (14 unchanged attributes hidden)
                                }
                                # (3 unchanged attributes hidden)
                            }
                            # (1 unchanged attribute hidden)
                        }
                      + thresholds  = [
                          + {
                              + colorMode = "critical"
                              + op        = "gt"
                              + value     = 1
                              + visible   = true
                            },
                        ]
                        # (6 unchanged attributes hidden)
                    },
                    {
                        datasource  = {
                            type = "cloudwatch"
                            uid  = "XnMFKnQVk"
                        }
                        fieldConfig = {
                            defaults  = {
                                color      = {
                                    mode = "palette-classic"
                                }
                                custom     = {
                                    axisLabel         = ""
                                    axisPlacement     = "auto"
                                    barAlignment      = 0
                                    drawStyle         = "line"
                                    fillOpacity       = 0
                                    gradientMode      = "none"
                                    hideFrom          = {
                                        legend  = false
                                        tooltip = false
                                        viz     = false
                                    }
                                    lineInterpolation = "linear"
                                    lineWidth         = 1
                                    pointSize         = 5
                                    scaleDistribution = {
                                        type = "linear"
                                    }
                                    showPoints        = "auto"
                                    spanNulls         = false
                                    stacking          = {
                                        group = "A"
                                        mode  = "none"
                                    }
                                    thresholdsStyle   = {
                                        mode = "off"
                                    }
                                }
                                mappings   = []
                                thresholds = {
                                    mode  = "absolute"
                                    steps = [
                                        {
                                            color = "green"
                                            value = null
                                        },
                                        {
                                            color = "red"
                                            value = 80
                                        },
                                    ]
                                }
                            }
                            overrides = []
                        }
                        gridPos     = {
                            h = 9
                            w = 7
                            x = 14
                            y = 18
                        }
                        options     = {
                            legend  = {
                                calcs       = []
                                displayMode = "list"
                                placement   = "bottom"
                            }
                            tooltip = {
                                mode = "single"
                                sort = "none"
                            }
                        }
                        targets     = [
                            {
                                alias            = ""
                                datasource       = {
                                    type = "cloudwatch"
                                    uid  = "XnMFKnQVk"
                                }
                                dimensions       = {
                                    LoadBalancer = "app/staging-push-load-balancer/aea5ef9d0a34453a"
                                }
                                expression       = ""
                                id               = ""
                                matchExact       = true
                                metricEditorMode = 0
                                metricName       = "HTTPCode_ELB_4XX_Count"
                                metricQueryType  = 0
                                namespace        = "AWS/ApplicationELB"
                                period           = ""
                                queryMode        = "Metrics"
                                refId            = "A"
                                region           = "default"
                                sqlExpression    = ""
                                statistic        = "Sum"
                            },
                            {
                                alias            = ""
                                datasource       = {
                                    type = "cloudwatch"
                                    uid  = "XnMFKnQVk"
                                }
                                dimensions       = {
                                    LoadBalancer = "app/staging-push-load-balancer/aea5ef9d0a34453a"
                                }
                                expression       = ""
                                id               = ""
                                matchExact       = true
                                metricEditorMode = 0
                                metricName       = "HTTPCode_Target_4XX_Count"
                                metricQueryType  = 0
                                namespace        = "AWS/ApplicationELB"
                                period           = ""
                                queryMode        = "Metrics"
                                refId            = "B"
                                region           = "default"
                                sqlExpression    = ""
                                statistic        = "Sum"
                            },
                        ]
                        title       = "4XX"
                        type        = "timeseries"
                    },
                ]
                tags                 = []
                # (15 unchanged attributes hidden)
            }
        )
        id           = "0:staging-push"
        # (7 unchanged attributes hidden)
    }

Plan: 1 to add, 2 to change, 1 to destroy.
::debug::Terraform exited with code 0.

Action: pull_request
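
The plan above adds mode = "dashed" under hideFrom. In Grafana time series panels the threshold display style is normally set through fieldConfig.defaults.custom.thresholdsStyle.mode, so it may be worth checking where the key ended up. The following is a minimal sketch of such a check, not part of this PR. The helper names and `planned_panel` are hypothetical, and because the plan hides thresholdsStyle for the changed panel, its value is assumed to be "off" as in the neighboring panels.

```python
from typing import Any, Dict


def custom_options(panel: Dict[str, Any]) -> Dict[str, Any]:
    """Return fieldConfig.defaults.custom, or an empty dict if absent."""
    return panel.get("fieldConfig", {}).get("defaults", {}).get("custom", {})


def threshold_style(panel: Dict[str, Any]) -> str:
    """Return the threshold display style, defaulting to "off"."""
    return custom_options(panel).get("thresholdsStyle", {}).get("mode", "off")


def has_misplaced_mode(panel: Dict[str, Any]) -> bool:
    """True when a "mode" key sits under hideFrom instead of thresholdsStyle."""
    return "mode" in custom_options(panel).get("hideFrom", {})


# Hypothetical panel fragment mirroring the planned change.
# thresholdsStyle is an assumption; the plan output does not show it.
planned_panel = {
    "fieldConfig": {
        "defaults": {
            "custom": {
                "hideFrom": {
                    "legend": False,
                    "tooltip": False,
                    "viz": False,
                    "mode": "dashed",
                },
                "thresholdsStyle": {"mode": "off"},
            }
        }
    }
}

print(threshold_style(planned_panel))
print(has_misplaced_mode(planned_panel))
```

If the check reports a misplaced key, moving "dashed" into thresholdsStyle.mode would be the likely adjustment, assuming the panel uses the time series schema.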

@geekbrother geekbrother requested a review from arein October 21, 2023 14:22
@geekbrother geekbrother marked this pull request as ready for review October 21, 2023 14:22
Member

@chris13524 chris13524 left a comment

Rubber stamping this. Seems right!

@geekbrother geekbrother merged commit a070782 into main Oct 23, 2023
10 checks passed
@chris13524 chris13524 deleted the max/chore/remove_for_from_evaluation branch October 23, 2023 14:10
Labels
accepted The issue has been accepted into the project area-telemetry Metrics & Monitoring
Development

Successfully merging this pull request may close these issues.

fix: alarm notifications disabled
3 participants