Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Response Ops][Alerting] Adding ability to run actions for backfill rule runs #200784

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

ymao1
Copy link
Contributor

@ymao1 ymao1 commented Nov 19, 2024

Resolves https://github.com/elastic/response-ops-team/issues/251

Note

This PR includes some saved object schema changes that I will pull out into their own separate PR in order to perform an intermediate release. I wanted to make sure all the schema changes made sense in the overall context of the PR before opening those separate PRs.

Update: PR for intermediate release here: #203184 (Merged)

Summary

Adds ability to run actions for backfill rule runs.

  • Updates schedule backfill API to accept run_actions parameter to specify whether to run actions for backfill.
  • Schedule API accepts any action where frequency.notifyWhen === 'onActiveAlert'. If a rule has multiple actions where some are onActiveAlert and some are onThrottleInterval, the invalid actions will be stripped and a warning returned in the schedule response but valid actions will be scheduled.
  • Connector IDs are extracted and stored as references in the ad hoc run params saved object
  • Any actions that result from a backfill task run are scheduled as low priority tasks

To Verify

  1. Create a detection rule. Make sure you have some past data that the rule can run over in order to generate actions. Make sure you add actions to the rule. For testing, I added some conditional actions so I could see actions running only on backfill runs using kibana.alert.rule.execution.type: "manual". Create actions with and without summaries.
  2. Schedule a backfill either directly via the API or using the detection UI. Verify that actions are run for the backfill runs that generate alerts.

@ymao1 ymao1 force-pushed the backfill-actions branch 8 times, most recently from 07d9365 to 8010638 Compare November 20, 2024 20:40
@kibanamachine
Copy link
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#7453

[✅] x-pack/test/alerting_api_integration/security_and_spaces/group1/config.ts: 200/200 tests passed.

see run history

@ymao1 ymao1 changed the title Adding ability to run actions for backfill rule runs [Response Ops][Alerting] Adding ability to run actions for backfill rule runs Nov 25, 2024
@@ -171,15 +177,17 @@ export function createBulkExecutionEnqueuerFunction({
executionId: actionToExecute.executionId,
consumer: actionToExecute.consumer,
relatedSavedObjects: relatedSavedObjectWithRefs,
...(actionToExecute.apiKeyId ? { apiKeyId: actionToExecute.apiKeyId } : {}),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Storing apiKeyId in the action_task_param saved object so that the API key invalidation task can query for any scheduled actions that are using invalidated API keys and wait before performing invalidation.

@ymao1 ymao1 self-assigned this Nov 25, 2024
@ymao1 ymao1 added Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v9.0.0 backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) v8.18.0 release_note:skip Skip the PR/issue when compiling release notes labels Nov 25, 2024
@ymao1 ymao1 marked this pull request as ready for review November 25, 2024 21:19
@ymao1 ymao1 requested review from a team as code owners November 25, 2024 21:19
@elasticmachine
Copy link
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

Comment on lines 131 to 133
if (doc['task.priority'].size() != 0) {
return doc['task.priority'].value;
} else if (params.priority_map.containsKey(taskType)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving the conversation from #199218 (comment) here, sorry for the delay!

Either would work! The way it currently is, we would sort by priority=descending which makes more sense in my head but definitely an argument for the reverse

Hmm, is it easy to check what other queueing products like RabbitMQ, Kafka, etc use for priority order? We can see if they do asc or desc and follow suit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For RabbitMQ: Priorities between 1 and 255 are supported, Larger numbers indicate higher priority.

For ActiveMQ: The full range of priority values (0-9) are supported by the [JDBC](https://activemq.apache.org/components/classic/documentation/jdbc-support) message store. For [KahaDB](https://activemq.apache.org/components/classic/documentation/Persistence/kahadb) three priority categories are supported, Low (< 4), Default (= 4) and High (> 4).

For Redis, priority can be implemented using sorted sets, using the score to represent task priority. For sorted sets, If B and A are two elements with a different score, then A > B if A.score is > B.score.

Kafka and SQS do not support event/message level priority. Instead they both recommend adding different topics and reading from them in priority level (read from high priority topic/queue first, then fallback to medium priority if nothing in the high priority topic/queue, etc).

Seems like in general, higher value = higher priority?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing the research, it sounds in line with the way we have it implemented today so we are good 👍 ❤️

Copy link
Contributor

@jloleysens jloleysens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Limited my review to the model/mappings changes. Thanks for explaining the rationale!

From what I can tell an intermediate release will be required for apiKeyId field on action_task_params 👍🏻

@ymao1
Copy link
Contributor Author

ymao1 commented Dec 2, 2024

@elasticmachine merge upstream

@ymao1 ymao1 requested review from pmuellr and adcoelho December 2, 2024 14:29
Copy link
Member

@pmuellr pmuellr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Detection rule runs backfill actions as expected, very cool. Code LGTM but marked some property setting simplifications (I think they are kosher).

One thing I'm wondering about. To figure out if the action was from a backfill, I used an original_event time value, which seemed to work. I think if I was going to use this, I'd want to have a way to know if this is a backfill, to put some kind of warning in my mustache template like

{{#isBackFill}}**Note: this was generated from a backfill**
{{/isBackFill}}

We can obviously do this later, but I was a little curious and haven't figured out where that would go, in our code ...

@ymao1
Copy link
Contributor Author

ymao1 commented Dec 5, 2024

marked some property setting simplifications

@pmuellr Did you have some comments that got lost somewhere?

@ymao1
Copy link
Contributor Author

ymao1 commented Dec 5, 2024

We can obviously do this later, but I was a little curious and haven't figured out where that would go, in our code ...

Good idea! I can create a follow-up issue for this.

@pmuellr
Copy link
Member

pmuellr commented Dec 5, 2024

Did you have some comments that got lost somewhere?

Welp dang, I did! Must have been the sporadic GH issues from yesterday. Let me see if I can find them, but they were nits anyway ...

@@ -171,15 +177,17 @@ export function createBulkExecutionEnqueuerFunction({
executionId: actionToExecute.executionId,
consumer: actionToExecute.consumer,
relatedSavedObjects: relatedSavedObjectWithRefs,
...(actionToExecute.apiKeyId ? { apiKeyId: actionToExecute.apiKeyId } : {}),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed this pattern used in a few places:

...(actionToExecute.apiKeyId ? { apiKeyId: actionToExecute.apiKeyId } : {}),

I think this can be simplified to

apiKeyId: actionToExecute.apiKeyId,

I'm not aware of - but wouldn't be suprised to find - something sensitive to the property not in the object vs the property having undefined as the value.

ymao1 added a commit that referenced this pull request Dec 11, 2024
…iate release (#203184)

## Summary

This PR contains just the schema changes required to support backfill
actions. This is meant for an intermediate release and then the full PR:
#200784 will follow after that.

---------

Co-authored-by: Elastic Machine <[email protected]>
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Dec 11, 2024
…iate release (elastic#203184)

## Summary

This PR contains just the schema changes required to support backfill
actions. This is meant for an intermediate release and then the full PR:
elastic#200784 will follow after that.

---------

Co-authored-by: Elastic Machine <[email protected]>
(cherry picked from commit b9bac16)
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Dec 12, 2024
…iate release (elastic#203184)

## Summary

This PR contains just the schema changes required to support backfill
actions. This is meant for an intermediate release and then the full PR:
elastic#200784 will follow after that.

---------

Co-authored-by: Elastic Machine <[email protected]>
@ymao1
Copy link
Contributor Author

ymao1 commented Dec 12, 2024

@elasticmachine merge upstream

kibanamachine added a commit that referenced this pull request Dec 12, 2024
…termediate release (#203184) (#203885)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Response Ops][Alerting] Backfill actions schema changes for
intermediate release
(#203184)](#203184)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Ying
Mao","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-12-11T18:47:01Z","message":"[Response
Ops][Alerting] Backfill actions schema changes for intermediate release
(#203184)\n\n## Summary\r\n\r\nThis PR contains just the schema changes
required to support backfill\r\nactions. This is meant for an
intermediate release and then the full
PR:\r\nhttps://github.com//pull/200784 will follow after
that.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine
<[email protected]>","sha":"b9bac1628bc489efec2da0f8f6fecf66962a6a51","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["Feature:Alerting","release_note:skip","Team:ResponseOps","v9.0.0","backport:prev-minor","v8.18.0"],"title":"[Response
Ops][Alerting] Backfill actions schema changes for intermediate
release","number":203184,"url":"https://github.com/elastic/kibana/pull/203184","mergeCommit":{"message":"[Response
Ops][Alerting] Backfill actions schema changes for intermediate release
(#203184)\n\n## Summary\r\n\r\nThis PR contains just the schema changes
required to support backfill\r\nactions. This is meant for an
intermediate release and then the full
PR:\r\nhttps://github.com//pull/200784 will follow after
that.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine
<[email protected]>","sha":"b9bac1628bc489efec2da0f8f6fecf66962a6a51"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/203184","number":203184,"mergeCommit":{"message":"[Response
Ops][Alerting] Backfill actions schema changes for intermediate release
(#203184)\n\n## Summary\r\n\r\nThis PR contains just the schema changes
required to support backfill\r\nactions. This is meant for an
intermediate release and then the full
PR:\r\nhttps://github.com//pull/200784 will follow after
that.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine
<[email protected]>","sha":"b9bac1628bc489efec2da0f8f6fecf66962a6a51"}},{"branch":"8.x","label":"v8.18.0","branchLabelMappingKey":"^v8.18.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->

Co-authored-by: Ying Mao <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
@ymao1
Copy link
Contributor Author

ymao1 commented Dec 12, 2024

One thing I'm wondering about. To figure out if the action was from a backfill, I used an original_event time value, which seemed to work. I think if I was going to use this, I'd want to have a way to know if this is a backfill, to put some kind of warning in my mustache template like

{{#isBackFill}}**Note: this was generated from a backfill**
{{/isBackFill}}

We can obviously do this later, but I was a little curious and haven't figured out where that would go, in our code ...

We have a field in our alert document: kibana.alert.rule.execution.type which can be either "manual" or "scheduled". Seems like we could do conditional actions to add 2 actions that print different messages based on whether the run was from normal execution or a backfill.

@pmuellr
Copy link
Member

pmuellr commented Dec 12, 2024

We have a field in our alert document: kibana.alert.rule.execution.type which can be either "manual" or "scheduled". Seems like we could do conditional actions to add 2 actions that print different messages based on whether the run was from normal execution or a backfill.

Does "manual" already have a use? Can we use a new value "backfill"?

@ymao1
Copy link
Contributor Author

ymao1 commented Dec 13, 2024

Does "manual" already have a use? Can we use a new value "backfill"?

In this case manual = backfill. The detections UI calls them manual rule runs so they wanted to set the value this way so it shows up consistently in their UI.

@pmuellr
Copy link
Member

pmuellr commented Dec 13, 2024

Seems like we could do conditional actions to add 2 actions that print different messages based on whether the run was from normal execution or a backfill.

Ya, that should work.

I was actually thinking about context variables I guess, and not alert fields. Still think we can just do this later, and we may get some requirements for it as well ...

@elasticmachine
Copy link
Contributor

elasticmachine commented Jan 2, 2025

💔 Build Failed

Failed CI Steps

History

cc @ymao1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) Feature:Alerting release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants