Adding method to ack #80

Merged 3 commits on Dec 17, 2024
15 changes: 15 additions & 0 deletions README.md
@@ -163,6 +163,21 @@ You can open the match requirement by using the ```--node-count``` option to fin

**_NOTE:_** The ```cmr```, ```--hunter-analyze``` and ```--anomaly-detection``` flags are mutually exclusive. They cannot be used together because they represent different algorithms designed for distinct use cases.

#### Ack known bugs
To ack known regressions, provide a YAML file listing the `uuid` and `metric` for which each regression was identified. For example:

```
---
ack:
  - uuid: "af24e294-93da-4729-a9cc-14acf38454e1"
    metric: "etcdCPU_avg"
    reason: "started thread with etcd team"
```

Ack'ing a regression ensures Orion does not keep notifying users about the same issue.

Engineers should add a `reason` that links to a JIRA issue or Slack thread, or notes that the changepoint % diff is too low to be worth alerting component teams.
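A minimal example invocation is sketched below (the `orion cmd` entry point is an assumption here; use whatever invocation you already use with `--hunter-analyze`, and substitute your own config and ack file paths):

```
orion cmd --config config.yaml --hunter-analyze --ack ack/example_ack.yaml
```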

### Daemon mode
Daemon mode runs Orion as a self-contained server dedicated to handling incoming requests. By sending a POST request with the name of a predefined test, users can trigger change point detection on the provided metadata and metrics. The response is returned as JSON, giving a structured output for easy integration and analysis. To trigger daemon mode, use the following commands:

14 changes: 14 additions & 0 deletions ack/example_ack.yaml
@@ -0,0 +1,14 @@
+---
+ack:
+  - uuid: "7f7337aa-cee3-4a36-b154-a7c48ed1fb75"
+    metric: "etcdCPU_avg"
+    reason: "Under our 10% target"
+  - uuid: "22e90f4e-1c79-4d9d-b2f6-b95a7072738c"
+    metric: "kubelet_avg"
+    reason: "Opened Dialog with node team"
+  - uuid: "22e90f4e-1c79-4d9d-b2f6-b95a7072738c"
+    metric: "ovnCPU_avg"
+    reason: "OCPBUGS111111"
+  - uuid: "93201652-b496-4594-b1ac-7eb9a32cd609"
+    metric: "apiserverCPU_avg"
+    reason: "opened discussion with api team"
5 changes: 4 additions & 1 deletion orion.py
@@ -11,7 +11,7 @@
 import uvicorn
 from fmatch.logrus import SingletonLogger
 from pkg.runTest import run
-from pkg.config import load_config
+from pkg.config import load_config, load_ack
 import pkg.constants as cnsts
 
 warnings.filterwarnings("ignore", message="Unverified HTTPS request.*")
@@ -78,6 +78,7 @@ def cli(max_content_width=120): # pylint: disable=unused-argument
 )
 @click.option("--filter", is_flag=True, help="Generate percent difference in comparison")
 @click.option("--config", default="config.yaml", help="Path to the configuration file")
+@click.option("--ack", default="", help="Optional ack YAML to ack known regressions")
 @click.option(
     "--save-data-path", default="data.csv", help="Path to save the output file"
 )
@@ -122,6 +123,8 @@ def cmd_analysis(**kwargs):
     level = logging.DEBUG if kwargs["debug"] else logging.INFO
     logger_instance = SingletonLogger(debug=level, name="Orion")
     logger_instance.info("🏹 Starting Orion in command-line mode")
+    if len(kwargs["ack"]) > 1 :
+        kwargs["ackMap"] = load_ack(kwargs["ack"])
     kwargs["configMap"] = load_config(kwargs["config"])
     output, regression_flag = run(**kwargs)
     if output is None:
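For readers following the diff: `load_ack` (added in `pkg/config.py` below) is just a `yaml.safe_load` of the ack file, so `kwargs["ackMap"]` ends up as a plain dict. A minimal sketch of that shape, using one entry copied from `ack/example_ack.yaml`:

```python
import yaml

# Illustrative only: one entry inlined from ack/example_ack.yaml,
# parsed the same way load_ack parses the real file.
ack_yaml = """
---
ack:
  - uuid: "7f7337aa-cee3-4a36-b154-a7c48ed1fb75"
    metric: "etcdCPU_avg"
    reason: "Under our 10% target"
"""
ack_map = yaml.safe_load(ack_yaml)
print(ack_map["ack"][0]["metric"])  # -> etcdCPU_avg
```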
16 changes: 12 additions & 4 deletions pkg/algorithms/edivisive/edivisive.py
@@ -16,15 +16,23 @@ class EDivisive(Algorithm):
     def _analyze(self):
         self.dataframe["timestamp"] = pd.to_datetime(self.dataframe["timestamp"])
         self.dataframe["timestamp"] = self.dataframe["timestamp"].astype(int) // 10**9
-        series= self.setup_series()
+        series = self.setup_series()
         change_points_by_metric = series.analyze().change_points
 
-        # filter by direction
+        # Process if we have ack'ed regression
+        ackSet = set()
+        if len(self.options["ack"]) > 1 :
+            for ack in self.options["ackMap"]["ack"]:
+                pos = series.find_by_attribute("uuid",ack["uuid"])
+                ackSet.add(str(pos[0]) + "_" + ack["metric"])
+
+        # filter by direction and ack'ed issues
         for metric, changepoint_list in change_points_by_metric.items():
             for i in range(len(changepoint_list)-1, -1, -1):
-                if ((self.metrics_config[metric]["direction"] == 1 and changepoint_list[i].stats.mean_1 > changepoint_list[i].stats.mean_2) or
-                    (self.metrics_config[metric]["direction"] == -1 and changepoint_list[i].stats.mean_1 < changepoint_list[i].stats.mean_2) ):
+                if ((self.metrics_config[metric]["direction"] == 1 and changepoint_list[i].stats.mean_1 > changepoint_list[i].stats.mean_2) or (self.metrics_config[metric]["direction"] == -1 and changepoint_list[i].stats.mean_1 < changepoint_list[i].stats.mean_2) or (str(changepoint_list[i].index) + "_" + changepoint_list[i].metric in ackSet)):
                     del changepoint_list[i]
 
         if [val for li in change_points_by_metric.values() for val in li]:
             self.regression_flag=True
         return series, change_points_by_metric
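To make the ack filter easier to follow, here is a standalone sketch of the same idea using stand-in objects (hunter's real series and change-point types are assumed to expose `.index` and `.metric`, as the diff above does):

```python
from dataclasses import dataclass

@dataclass
class ChangePoint:
    index: int   # position of the run inside the analyzed series
    metric: str  # metric the change point was detected on

# Keys mirror str(pos[0]) + "_" + ack["metric"]: "<series index>_<metric name>".
ack_set = {"3_etcdCPU_avg"}

change_points_by_metric = {
    "etcdCPU_avg": [ChangePoint(3, "etcdCPU_avg"), ChangePoint(7, "etcdCPU_avg")],
}

# Walk each list backwards so deleting an entry does not shift the
# indices still left to inspect, exactly like the loop in _analyze().
for metric, changepoint_list in change_points_by_metric.items():
    for i in range(len(changepoint_list) - 1, -1, -1):
        if f"{changepoint_list[i].index}_{changepoint_list[i].metric}" in ack_set:
            del changepoint_list[i]

print(change_points_by_metric)  # the ack'ed change point at index 3 is dropped
```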
17 changes: 17 additions & 0 deletions pkg/config.py
@@ -54,6 +54,23 @@ def load_config(config: str, parameters: Dict= None) -> Dict[str, Any]:
     rendered_config = yaml.safe_load(rendered_config_yaml)
     return rendered_config
 
+def load_ack(ack: str) -> Dict[str,Any]:
+    logger_instance = SingletonLogger.getLogger("Orion")
+    try:
+        with open(ack, "r", encoding="utf-8") as template_file:
+            template_content = template_file.read()
+            logger_instance.debug("The %s file has successfully loaded", ack)
+    except FileNotFoundError as e:
+        logger_instance.error("Config file not found: %s", e)
+        sys.exit(1)
+    except Exception as e: # pylint: disable=broad-exception-caught
+        logger_instance.error("An error occurred: %s", e)
+        sys.exit(1)
+
+    required_parameters = get_template_variables(template_content)
+
+    rendered_config = yaml.safe_load(template_content)
+    return rendered_config
 
 def get_template_variables(template_content: str) -> Set[str]:
     """Extracts all variables from the Jinja2 template content."""