diff --git a/README.md b/README.md
index 4b47ff5..f21dab9 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,13 @@
 # Orion - CLI tool to find regressions
-Orion stands as a powerful command-line tool designed for identifying regressions within perf-scale CPT runs, leveraging metadata provided during the process. The detection mechanism relies on [hunter](https://github.com/datastax-labs/hunter).
+Orion is a command-line tool/daemon for identifying regressions within perf-scale CPT runs, leveraging metadata provided during the process. The detection mechanism relies on [hunter](https://github.com/datastax-labs/hunter).
 
 Below is an illustrative example of the config and metadata that Orion can handle:
 
 ```
 tests :
   - name : aws-small-scale-cluster-density-v2
+    index: ospst-perf-scale-ci-*
+    benchmarkIndex: ospst-ripsaw-kube-burner*
     metadata:
       platform: AWS
       masterNodesType: m6a.xlarge
@@ -13,7 +15,7 @@ tests :
       workerNodesType: m6a.xlarge
       workerNodesCount: 24
       benchmark.keyword: cluster-density-v2
-      ocpVersion: 4.15
+      ocpVersion: {{ version }}
       networkType: OVNKubernetes
     # encrypted: true
     # fips: false
@@ -50,14 +52,14 @@ tests :
       agg:
         value: cpu
         agg_type: avg
-
-  - name: etcdDisck
+
+  - name: etcdDisk
     metricName : 99thEtcdDiskBackendCommitDurationSeconds
     metric_of_interest: value
     agg:
       value: duration
       agg_type: avg
-
+
 ```
 
 ## Build Orion
@@ -65,22 +67,25 @@ Building Orion is a straightforward process. Follow these commands:
 
 **Note: Orion Compatibility**
 
-Orion currently supports Python versions `3.8.x`, `3.9.x`, `3.10.x`, and `3.11.x`. Please be aware that using other Python versions might lead to dependency conflicts caused by hunter, creating a challenging situation known as "dependency hell." It's crucial to highlight that Python `3.12.x` may result in errors due to the removal of distutils, a dependency used by numpy. This information is essential to ensure a smooth experience with Orion and avoid potential compatibility issues.
+Orion currently supports Python version `3.11.x`. Other Python versions may run into dependency conflicts caused by hunter ("dependency hell"); in particular, Python `3.12.x` can fail because numpy depends on distutils, which was removed in 3.12.
 
 Clone the current repository using git clone.
 
 ```
 >> git clone
->> pip install venv
+>> python3 -m venv venv
 >> source venv/bin/activate
 >> pip install -r requirements.txt
 >> export ES_SERVER =
 >> pip install .
 ```
 
 ## Run Orion
-Executing Orion is as simple as building it. After following the build steps, run the following:
+Executing Orion is as seamless as building it. Orion now offers both a command-line mode and a daemon mode, so users can pick whichever fits their workflow.
+
+### Command-line mode
+Running Orion in command-line mode is straightforward. Simply follow these instructions:
 ```
->> orion
+>> orion cmd --hunter-analyze
 ```
 
 At the moment,
@@ -92,7 +97,98 @@ Activate Orion's regression detection tool for performance-scale CPT runs effort
 
-Additionally, users can specify a custom path for the output CSV file using the ```--output``` flag, providing control over the location where the generated CSV will be stored.
+Additionally, users can specify a custom path for the output CSV file using the ```--output-path``` flag, providing control over the location where the generated CSV will be stored.
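+
+For example, a typical invocation against one of the bundled example configs (the paths below are illustrative) looks like this:
+
+```
+>> orion cmd --config examples/small-scale-node-density-cni.yaml --output-path results.csv --hunter-analyze -o json
+```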
+### Daemon mode
+Daemon mode runs Orion as a self-contained server dedicated to handling incoming requests. By sending a POST request with the name of one of the predefined tests, users can trigger change point detection on the associated metadata and metrics. The response is returned as JSON, providing structured output for seamless integration and analysis. To start daemon mode, run:
+
+```
+>> orion daemon
+```
+
+**Submitting a Test Request to the Daemon Service**
+
+To interact with the daemon service, you can send a POST request using `curl` with specific parameters.
+
+*Request URL*
+
+```
+POST http://127.0.0.1:8000/daemon
+```
+
+*Parameters*
+
+- uuid (optional): The uuid of the run you want to compare with similar runs.
+- baseline (optional): Comma-separated uuids of the baseline runs you want to compare against.
+- version (optional): The ocpVersion you want to use for metadata; defaults to `4.15`.
+- filter_changepoints (optional): Set to `true` if you only want changepoints to show up in the response.
+- test_name (optional): Name of the test you want to perform; defaults to `small-scale-cluster-density`.
+
+Example
+```
+curl -L -X POST 'http://127.0.0.1:8000/daemon?filter_changepoints=true&version=4.14&test_name=small-scale-node-density-cni'
+```
+
+Below is a sample output structure: the top level of the JSON contains the test name, and within each test, runs are organized into arrays. Each run includes succinct metadata alongside the corresponding metrics for comprehensive analysis.
+```
+{
+    "aws-small-scale-cluster-density-v2": [
+        {
+            "uuid": "4cb3efec-609a-4ac5-985d-4cbbcbb11625",
+            "timestamp": 1704889895,
+            "buildUrl": "https://tinyurl.com/2ya4ka9z",
+            "metrics": {
+                "ovnCPU_avg": {
+                    "value": 2.8503958847,
+                    "percentage_change": 0
+                },
+                "apiserverCPU_avg": {
+                    "value": 10.2344511574,
+                    "percentage_change": 0
+                },
+                "etcdCPU_avg": {
+                    "value": 8.7663162253,
+                    "percentage_change": 0
+                },
+                "P99": {
+                    "value": 13000,
+                    "percentage_change": 0
+                }
+            },
+            "is_changepoint": false
+        }
+    ]
+}
+```
+
+**Querying the List of Tests Available to the Daemon Service**
+
+To list the tests available, you can send a GET request using `curl`. This endpoint takes no parameters.
+
+*Request URL*
+
+```
+GET http://127.0.0.1:8000/daemon/options
+```
+
+Example
+```
+curl -L 'http://127.0.0.1:8000/daemon/options'
+```
+
+Below is a sample output structure: it lists the predefined test configurations available to the daemon.
+```
+{
+    "options": [
+        "small-scale-cluster-density",
+        "small-scale-node-density-cni"
+    ]
+}
+```
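+
+The same endpoints can be exercised from Python. Below is a minimal sketch using the `requests` library (not one of Orion's dependencies; any HTTP client works):
+
+```
+import requests
+
+BASE_URL = "http://127.0.0.1:8000"
+
+# Discover which predefined test configs the daemon knows about.
+options = requests.get(f"{BASE_URL}/daemon/options", timeout=30).json()
+print(options["options"])
+
+# Trigger change point detection for one of them, keeping only changepoints.
+response = requests.post(
+    f"{BASE_URL}/daemon",
+    params={
+        "test_name": "small-scale-node-density-cni",
+        "version": "4.14",
+        "filter_changepoints": "true",
+    },
+    timeout=300,
+)
+print(response.json())
+```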
 Orion's seamless integration with metadata and hunter ensures a robust regression detection tool for perf-scale CPT runs.
 
@@ -101,7 +197,9 @@ Orion's seamless integration with metadata and hunter ensures a robust regressio
 ```
 tests :
-  - name : current-uuid-etcd-duration
+  - name : aws-small-scale-cluster-density-v2
+    index: ospst-perf-scale-ci-*
+    benchmarkIndex: ospst-ripsaw-kube-burner*
     metrics :
     - name: etcdDisk
       metricName : 99thEtcdDiskBackendCommitDurationSeconds
      metric_of_interest: value
@@ -111,5 +209,7 @@ tests :
      agg:
        value: duration
        agg_type: avg
 ```
 
-Orion provides flexibility if you know the comparison uuid you want to compare among, use the ```--baseline``` flag. This should only be used in conjunction when setting uuid.
-Similar to the uuid section mentioned above, you'll have to set a metrics section to specify the data points you want to collect on
+Orion provides flexibility: if you already know the runs you want to compare against, pass them with the ```--baseline``` flag. It should only be used together with ```--uuid```. As in the uuid section above, you'll also have to set a metrics section to specify the data points you want to collect. See the example invocation below this section.
+
+The `--uuid` and `--baseline` options are available in both cmd and daemon mode.
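+
+For example (the config path and uuids below are illustrative placeholders):
+
+```
+>> orion cmd --config examples/small-scale-cluster-density.yaml --uuid 4cb3efec-609a-4ac5-985d-4cbbcbb11625 --baseline uuid1,uuid2 --hunter-analyze
+```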
diff --git a/utils/__init__.py b/configs/__init__.py
similarity index 100%
rename from utils/__init__.py
rename to configs/__init__.py
diff --git a/configs/small-scale-cluster-density.yml b/configs/small-scale-cluster-density.yml
new file mode 100644
index 0000000..63646b2
--- /dev/null
+++ b/configs/small-scale-cluster-density.yml
@@ -0,0 +1,56 @@
+# This is a template file
+tests :
+  - name : aws-small-scale-cluster-density-v2
+    index: ospst-perf-scale-ci-*
+    benchmarkIndex: ospst-ripsaw-kube-burner*
+    metadata:
+      platform: AWS
+      masterNodesType: m6a.xlarge
+      masterNodesCount: 3
+      workerNodesType: m6a.xlarge
+      workerNodesCount: 24
+      benchmark.keyword: cluster-density-v2
+      ocpVersion: {{ version }}
+      networkType: OVNKubernetes
+    # encrypted: true
+    # fips: false
+    # ipsec: false
+
+    metrics :
+    - name: podReadyLatency
+      metricName: podLatencyQuantilesMeasurement
+      quantileName: Ready
+      metric_of_interest: P99
+      not:
+        jobConfig.name: "garbage-collection"
+
+    - name: apiserverCPU
+      metricName : containerCPU
+      labels.namespace.keyword: openshift-kube-apiserver
+      metric_of_interest: value
+      agg:
+        value: cpu
+        agg_type: avg
+
+    - name: ovnCPU
+      metricName : containerCPU
+      labels.namespace.keyword: openshift-ovn-kubernetes
+      metric_of_interest: value
+      agg:
+        value: cpu
+        agg_type: avg
+
+    - name: etcdCPU
+      metricName : containerCPU
+      labels.namespace.keyword: openshift-etcd
+      metric_of_interest: value
+      agg:
+        value: cpu
+        agg_type: avg
+
+    - name: etcdDisk
+      metricName : 99thEtcdDiskBackendCommitDurationSeconds
+      metric_of_interest: value
+      agg:
+        value: duration
+        agg_type: avg
diff --git a/configs/small-scale-node-density-cni.yml b/configs/small-scale-node-density-cni.yml
new file mode 100644
index 0000000..48e3622
--- /dev/null
+++ b/configs/small-scale-node-density-cni.yml
@@ -0,0 +1,59 @@
+# This is a template file
+tests :
+  - name : aws-small-scale-node-density-cni
+    index: ospst-perf-scale-ci-*
+    benchmarkIndex: ospst-ripsaw-kube-burner*
+    metadata:
+      platform: AWS
+      masterNodesType: m6a.xlarge
+      masterNodesCount: 3
+      workerNodesType: m6a.xlarge
+      workerNodesCount: 6
+      infraNodesCount: 3
+      benchmark.keyword: node-density-cni
+      ocpVersion: {{ version }}
+      networkType: OVNKubernetes
+      infraNodesType: r5.2xlarge
+    # encrypted: true
+    # fips: false
+    # ipsec: false
+
+    metrics :
+    - name: podReadyLatency
+      metricName: podLatencyQuantilesMeasurement
+      quantileName: Ready
+      metric_of_interest: P99
+      not:
+        jobConfig.name: "garbage-collection"
+
+    - name: apiserverCPU
+      metricName : containerCPU
+      labels.namespace.keyword: openshift-kube-apiserver
+      metric_of_interest: value
+      agg:
+        value: cpu
+        agg_type: avg
+
+    - name: ovnCPU
+      metricName : containerCPU
+      labels.namespace.keyword: openshift-ovn-kubernetes
+      metric_of_interest: value
+      agg:
+        value: cpu
+        agg_type: avg
+
+    - name: etcdCPU
+      metricName : containerCPU
+      labels.namespace.keyword: openshift-etcd
+      metric_of_interest: value
+      agg:
+        value: cpu
+        agg_type: avg
+
+    - name: etcdDisk
+      metricName : 99thEtcdDiskBackendCommitDurationSeconds
+      metric_of_interest: value
+      agg:
+        value: duration
+        agg_type: avg
+
diff --git a/examples/small-scale-node-density-cni.yaml b/examples/small-scale-node-density-cni.yaml
index 28536d1..64b99f4 100644
--- a/examples/small-scale-node-density-cni.yaml
+++ b/examples/small-scale-node-density-cni.yaml
@@ -10,7 +10,7 @@ tests :
       workerNodesCount: 6
       infraNodesCount: 3
       benchmark.keyword: node-density-cni
-      ocpVersion: 4.15
+      ocpVersion: 4.14
       networkType: OVNKubernetes
       infraNodesType: r5.2xlarge
     # encrypted: true
diff --git a/orion.py b/orion.py
index 064046e..223f49f 100644
--- a/orion.py
+++ b/orion.py
@@ -3,135 +3,79 @@
 """
 # pylint: disable = import-error
+import logging
 import sys
 import warnings
-from functools import reduce
-import logging
-import os
-import re
-import pyshorteners
-
 import click
-import pandas as pd
-
-from fmatch.matcher import Matcher
-from utils import orion_funcs
+import uvicorn
+from pkg.logrus import SingletonLogger
+from pkg.runTest import run
+from pkg.utils import load_config
 
 warnings.filterwarnings("ignore", message="Unverified HTTPS request.*")
 
+
 @click.group()
-# pylint: disable=unused-argument
-def cli(max_content_width=120):
+def cli(max_content_width=120):  # pylint: disable=unused-argument
     """
-    cli function to group commands
+    Orion is a tool which can run change point detection for a set of runs using statistical models
     """
 
 
-# pylint: disable=too-many-locals, too-many-statements
-@click.command()
-@click.option("--uuid", default="", help="UUID to use as base for comparisons")
-@click.option("--baseline", default="", help="Baseline UUID(s) to to compare against uuid")
+# pylint: disable=too-many-locals
+@cli.command(name="cmd")
 @click.option("--config", default="config.yaml", help="Path to the configuration file")
-@click.option("--output-path", default="output.csv", help="Path to save the output csv file")
-@click.option("--debug", is_flag=True, help="log level ")
-@click.option("--hunter-analyze",is_flag=True, help="run hunter analyze")
+@click.option(
+    "--output-path", default="output.csv", help="Path to save the output csv file"
+)
+@click.option("--debug", default=False, is_flag=True, help="log level")
+@click.option("--hunter-analyze", is_flag=True, help="run hunter analyze")
 @click.option(
     "-o",
-    "--output",
+    "--output-format",
     type=click.Choice(["json", "text"]),
     default="text",
     help="Choose output format (json or text)",
 )
-def orion(**kwargs):
-    """Orion is the cli tool to detect regressions over the runs
-
-    \b
-    Args:
-        uuid (str): gather metrics based on uuid
-        baseline (str): baseline uuid to compare against uuid (uuid must be set when using baseline)
-        config (str): path to the config file
-        debug (bool): lets you log debug mode
-        output (str): path to the output csv file
-        hunter_analyze (bool): turns on hunter analysis of gathered uuid(s) data
+@click.option("--uuid", default="", help="UUID to use as base for comparisons")
+@click.option(
+    "--baseline", default="", help="Baseline UUID(s) to compare against uuid"
+)
+def cmd_analysis(**kwargs):
+    """
+    Orion runs in command-line mode and helps in detecting regressions
     """
-    level = logging.DEBUG if kwargs["debug"] else logging.INFO
-    logger = logging.getLogger("Orion")
-    logger = orion_funcs.set_logging(level, logger)
-    data = orion_funcs.load_config(kwargs["config"],logger)
-    ES_URL=None
-
-    if "ES_SERVER" in data.keys():
-        ES_URL = data["ES_SERVER"]
-    else:
-        if "ES_SERVER" in os.environ:
-            ES_URL = os.environ.get("ES_SERVER")
-        else:
logger.error("ES_SERVER environment variable/config variable not set") - sys.exit(1) - shortener = pyshorteners.Shortener() - for test in data["tests"]: - benchmarkIndex=test['benchmarkIndex'] - uuid = kwargs["uuid"] - baseline = kwargs["baseline"] - fingerprint_index = test["index"] - match = Matcher(index=fingerprint_index, - level=level, ES_URL=ES_URL, verify_certs=False) - if uuid == "": - metadata = orion_funcs.get_metadata(test, logger) - else: - metadata = orion_funcs.filter_metadata(uuid,match,logger) - - logger.info("The test %s has started", test["name"]) - if baseline == "": - runs = match.get_uuid_by_metadata(metadata) - uuids = [run["uuid"] for run in runs] - buildUrls = {run["uuid"]: run["buildUrl"] for run in runs} - if len(uuids) == 0: - logging.info("No UUID present for given metadata") - sys.exit() - else: - uuids = [uuid for uuid in re.split(' |,',baseline) if uuid] - uuids.append(uuid) - buildUrls = orion_funcs.get_build_urls(fingerprint_index, uuids,match) - - fingerprint_index=benchmarkIndex - if metadata["benchmark.keyword"] in ["ingress-perf","k8s-netperf"] : - ids = uuids - else: - if baseline == "": - runs = match.match_kube_burner(uuids, fingerprint_index) - ids = match.filter_runs(runs, runs) - else: - ids = uuids - metrics = test["metrics"] - dataframe_list = orion_funcs.get_metric_data(ids, fingerprint_index, metrics, match, logger) - - for i, df in enumerate(dataframe_list): - if i != 0 and ('timestamp' in df.columns): - dataframe_list[i] = df.drop(columns=['timestamp']) - - merged_df = reduce( - lambda left, right: pd.merge(left, right, on="uuid", how="inner"), - dataframe_list, - ) - - shortener = pyshorteners.Shortener() - merged_df["buildUrl"] = merged_df["uuid"].apply( - lambda uuid: shortener.tinyurl.short(buildUrls[uuid])) #pylint: disable = cell-var-from-loop - csv_name = kwargs["output_path"].split(".")[0]+"-"+test['name']+".csv" - match.save_results( - merged_df, csv_file_path=csv_name - ) - - if kwargs["hunter_analyze"]: - orion_funcs.run_hunter_analyze(merged_df,test,kwargs["output"]) + logger_instance = SingletonLogger(debug=level).logger + logger_instance.info("🏹 Starting Orion in command-line mode") + kwargs["configMap"] = load_config(kwargs["config"]) + output = run(**kwargs) + if output is None: + logger_instance.error("Terminating test") + sys.exit(0) + for test_name, result_table in output.items(): + print(test_name) + print("=" * len(test_name)) + print(result_table) + + +@cli.command(name="daemon") +@click.option("--debug", default=False, is_flag=True, help="log level") +def rundaemon(debug): + """ + Orion runs on daemon mode on port 8000 + \b + """ + level = logging.DEBUG if debug else logging.INFO + logger_instance = SingletonLogger(debug=level).logger + logger_instance.info("🏹 Starting Orion in Daemon mode") + uvicorn.run("pkg.daemon:app", port=8000) if __name__ == "__main__": if len(sys.argv) <= 1: - cli.main(['--help']) + cli.main(["--help"]) else: - print(len(sys.argv)) - cli.add_command(orion) + cli.add_command(cmd_analysis) + cli.add_command(rundaemon) cli() diff --git a/pkg/__init__.py b/pkg/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/pkg/daemon.py b/pkg/daemon.py new file mode 100644 index 0000000..7a77a0e --- /dev/null +++ b/pkg/daemon.py @@ -0,0 +1,102 @@ +""" +Module to run orion in daemon mode +""" + +import logging +import os + +from fastapi import FastAPI, HTTPException +from jinja2 import Template +import pkg_resources +import yaml +from pkg.logrus import SingletonLogger + +from . 
+
+app = FastAPI()
+logger_instance = SingletonLogger(debug=logging.INFO).logger
+
+
+@app.post("/daemon")
+async def daemon(
+    version: str = "4.15",
+    uuid: str = "",
+    baseline: str = "",
+    filter_changepoints="",
+    test_name="small-scale-cluster-density",
+):
+    """Handles POST requests on /daemon and runs change point detection
+
+    Args:
+        version (str, optional): ocpVersion rendered into the config template. Defaults to "4.15".
+        uuid (str, optional): uuid of the run to use as base for comparisons.
+        baseline (str, optional): comma-separated baseline uuids to compare against.
+        filter_changepoints (str, optional): "true" to return only changepoints.
+        test_name (str, optional): predefined test config to run. Defaults to "small-scale-cluster-density".
+
+    Returns:
+        json: json object of the changepoints and metrics
+    """
+    parameters = {"version": version}
+    config_file_name = test_name + ".yml"
+    argDict = {
+        "config": config_file_name,
+        "output_path": "output.csv",
+        "hunter_analyze": True,
+        "output_format": "json",
+        "uuid": uuid,
+        "baseline": baseline,
+        "configMap": render_template(config_file_name, parameters),
+    }
+    filter_changepoints = filter_changepoints == "true"
+    result = runTest.run(**argDict)
+    if result is None:
+        return {"Error": "No UUID with given metadata"}
+    if filter_changepoints:
+        for key, value in result.items():
+            result[key] = list(filter(lambda x: x.get("is_changepoint", False), value))
+    return result
+
+
+@app.get("/daemon/options")
+async def get_options():
+    """Lists all the tests available in daemon mode
+
+    Raises:
+        HTTPException: Config directory not found
+        HTTPException: cannot list files in the config directory
+
+    Returns:
+        config: list of files
+    """
+    config_dir = pkg_resources.resource_filename("configs", "")
+    if not os.path.isdir(config_dir):
+        raise HTTPException(status_code=404, detail="Config directory not found")
+    try:
+        files = [
+            os.path.splitext(file)[0]
+            for file in os.listdir(config_dir)
+            if file != "__init__.py"
+            and not file.endswith(".pyc")
+            and file != "__pycache__"
+        ]
+        return {"options": files}
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e)) from e
+
+
+def render_template(test_name, parameters):
+    """replace parameters in the config file
+
+    Args:
+        test_name (str): file name of the config template
+        parameters (dict): parameters to be replaced
+
+    Returns:
+        dict: rendered configMap as a dict
+    """
+    config_path = pkg_resources.resource_filename("configs", test_name)
+    with open(config_path, "r", encoding="utf-8") as template_file:
+        template_content = template_file.read()
+    template = Template(template_content)
+    rendered_config_yaml = template.render(parameters)
+    rendered_config = yaml.safe_load(rendered_config_yaml)
+    return rendered_config
diff --git a/pkg/logrus.py b/pkg/logrus.py
new file mode 100644
index 0000000..d9c9539
--- /dev/null
+++ b/pkg/logrus.py
@@ -0,0 +1,45 @@
+"""
+Logger for orion
+"""
+
+import logging
+import sys
+
+
+class SingletonLogger:
+    """Singleton logger so logging is configured in one single place
+
+    Returns:
+        SingletonLogger: the cached singleton instance
+    """
+
+    _instance = None
+
+    def __new__(cls, debug):
+        if cls._instance is None:
+            cls._instance = super().__new__(cls)
+            cls._instance._logger = cls._initialize_logger(debug)
+        return cls._instance
+
+    @staticmethod
+    def _initialize_logger(debug):
+        level = debug  # callers pass a logging level, e.g. logging.INFO or logging.DEBUG
+        logger = logging.getLogger("Orion")
+        logger.setLevel(level)
+        handler = logging.StreamHandler(sys.stdout)
+        handler.setLevel(level)
+        formatter = logging.Formatter(
+            "%(asctime)s - %(name)s - %(levelname)s - file: %(filename)s - line: %(lineno)d - %(message)s"  # pylint: disable = line-too-long
+        )
+        handler.setFormatter(formatter)
+        logger.addHandler(handler)
+        return logger
+
+    @property
+    def logger(self):
+        """property to return the logger, getter method
+
+        Returns:
+            logging.Logger: the shared logger instance
+        """
+        return self._logger  # pylint: disable = no-member
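A note on the singleton semantics above: the first construction wins, so whichever module instantiates `SingletonLogger` first fixes the log level for everyone, and later `debug` arguments are ignored. A quick illustrative sketch (not part of this patch):

```
import logging
from pkg.logrus import SingletonLogger

first = SingletonLogger(debug=logging.DEBUG).logger   # configures the shared "Orion" logger at DEBUG
second = SingletonLogger(debug=logging.INFO).logger   # returns the cached instance; INFO is ignored
assert first is second and first.level == logging.DEBUG
```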
diff --git a/pkg/runTest.py b/pkg/runTest.py
new file mode 100644
index 0000000..0a19401
--- /dev/null
+++ b/pkg/runTest.py
@@ -0,0 +1,42 @@
+"""
+run test
+"""
+
+import logging
+from fmatch.matcher import Matcher
+from pkg.logrus import SingletonLogger
+from pkg.utils import run_hunter_analyze, get_es_url, process_test
+
+
+def run(**kwargs):
+    """run method to start the tests
+
+    Expected kwargs:
+        configMap (dict): parsed config file
+        output_path (str): output path to save the data
+        hunter_analyze (bool): changepoint detection through hunter, defaults to True
+        output_format (str): output to be table ("text") or "json"
+        uuid (str): optional uuid to use as base for comparisons
+        baseline (str): optional comma-separated baseline uuids
+
+    Returns:
+        dict: results keyed by test name, or None on failure
+    """
+    logger_instance = SingletonLogger(debug=logging.INFO).logger
+    data = kwargs["configMap"]
+
+    ES_URL = get_es_url(data)
+    result_output = {}
+    for test in data["tests"]:
+        match = Matcher(
+            index=test["index"], level=logger_instance.level, ES_URL=ES_URL, verify_certs=False
+        )
+        result = process_test(
+            test, match, kwargs["output_path"], kwargs["uuid"], kwargs["baseline"]
+        )
+        if result is None:
+            return None
+        if kwargs["hunter_analyze"]:
+            testname, result_data = run_hunter_analyze(
+                result, test, output=kwargs["output_format"]
+            )
+            result_output[testname] = result_data
+    return result_output
diff --git a/utils/orion_funcs.py b/pkg/utils.py
similarity index 59%
rename from utils/orion_funcs.py
rename to pkg/utils.py
index 995a2ff..bf606bf 100644
--- a/utils/orion_funcs.py
+++ b/pkg/utils.py
@@ -4,8 +4,11 @@
 """
 # pylint: disable = import-error
+from functools import reduce
 import json
 import logging
+import os
+import re
 import sys
 
 import yaml
@@ -13,6 +16,12 @@
 from hunter.report import Report, ReportType
 from hunter.series import Metric, Series
 
+import pandas as pd
+import pyshorteners
+
+from pkg.logrus import SingletonLogger
+
+
 
 def run_hunter_analyze(merged_df, test, output):
@@ -48,13 +57,15 @@ def run_hunter_analyze(merged_df, test, output):
     report = Report(series, change_points)
     if output == "text":
         output_table = report.produce_report(
-            test_name="test", report_type=ReportType.LOG
+            test_name=test["name"], report_type=ReportType.LOG
         )
-        print(output_table)
-    elif output == "json":
+        return test["name"], output_table
+
+    if output == "json":
         change_points_by_metric = series.analyze().change_points
         output_json = parse_json_output(merged_df, change_points_by_metric)
-        print(json.dumps(output_json, indent=4))
+        return test["name"], output_json
+    return None
 
 
 def parse_json_output(merged_df, change_points_by_metric):
@@ -91,7 +102,7 @@ def parse_json_output(merged_df, change_points_by_metric):
 
 
 # pylint: disable=too-many-locals
-def get_metric_data(ids, index, metrics, match, logger):
+def get_metric_data(ids, index, metrics, match):
     """Gets detailed metrics based on the metric yaml list
 
     Args:
@@ -104,10 +115,11 @@ def get_metric_data(ids, index, metrics, match):
     Returns:
         dataframe_list: dataframe of the all metrics
     """
+    logger_instance = SingletonLogger(debug=logging.INFO).logger
     dataframe_list = []
     for metric in metrics:
         metric_name = metric["name"]
-        logger.info("Collecting %s", metric_name)
+        logger_instance.info("Collecting %s", metric_name)
         metric_of_interest = metric["metric_of_interest"]
 
         if "agg" in metric.keys():
@@ -120,10 +132,10 @@ def get_metric_data(ids, index, metrics, match):
                 cpu_df= cpu_df.drop_duplicates(subset=['uuid'],keep='first')
                 cpu_df = cpu_df.rename(columns={agg_name: metric_name + "_" + agg_type})
                 dataframe_list.append(cpu_df)
-                logger.debug(cpu_df)
+                logger_instance.debug(cpu_df)
             except Exception as e:  # pylint: disable=broad-exception-caught
-                logger.error(
+                logger_instance.error(
                     "Couldn't get agg metrics %s, exception %s",
                     metric_name,
                     e,
@@ -134,13 +146,13 @@ def get_metric_data(ids, index, metrics, match):
                 podl_df = match.convert_to_df(
                     podl, columns=["uuid", "timestamp", metric_of_interest]
                 )
-                podl_df= podl_df.drop_duplicates(subset=['uuid'],keep='first')
-                podl_df = podl_df.rename(columns={metric_of_interest:
-                                                  metric_name + "_" + metric_of_interest})
+                podl_df = podl_df.rename(
+                    columns={metric_of_interest: metric_name + "_" + metric_of_interest})
+                podl_df = podl_df.drop_duplicates()
                 dataframe_list.append(podl_df)
-                logger.debug(podl_df)
+                logger_instance.debug(podl_df)
             except Exception as e:  # pylint: disable=broad-exception-caught
-                logger.error(
+                logger_instance.error(
                     "Couldn't get metrics %s, exception %s",
                     metric_name,
                     e,
@@ -148,7 +160,7 @@ def get_metric_data(ids, index, metrics, match):
     return dataframe_list
 
 
-def get_metadata(test, logger):
+def get_metadata(test):
    """Gets metadata of the run from each test
 
    Args:
@@ -157,11 +169,76 @@ def get_metadata(test):
    Returns:
        dict: dictionary of the metadata
    """
+    logger_instance = SingletonLogger(debug=logging.INFO).logger
     metadata = test["metadata"]
     metadata["ocpVersion"] = str(metadata["ocpVersion"])
-    logger.debug("metadata" + str(metadata))
+    logger_instance.debug("metadata" + str(metadata))
     return metadata
 
+
+def load_config(config):
+    """Loads config file
+
+    Args:
+        config (str): path to config file
+
+    Returns:
+        dict: dictionary of the config file
+    """
+    logger_instance = SingletonLogger(debug=logging.INFO).logger
+    try:
+        with open(config, "r", encoding="utf-8") as file:
+            data = yaml.safe_load(file)
+            logger_instance.debug("The %s file has successfully loaded", config)
+    except FileNotFoundError as e:
+        logger_instance.error("Config file not found: %s", e)
+        sys.exit(1)
+    except Exception as e:  # pylint: disable=broad-exception-caught
+        logger_instance.error("An error occurred: %s", e)
+        sys.exit(1)
+    return data
+
+
+def get_es_url(data):
+    """Gets es url from config or env
+
+    Args:
+        data (dict): config file data
+
+    Returns:
+        str: es url
+    """
+    logger_instance = SingletonLogger(debug=logging.INFO).logger
+    if "ES_SERVER" in data.keys():
+        return data["ES_SERVER"]
+    if "ES_SERVER" in os.environ:
+        return os.environ.get("ES_SERVER")
+    logger_instance.error("ES_SERVER environment variable/config variable not set")
+    sys.exit(1)
+
+
+def get_ids_from_index(metadata, fingerprint_index, uuids, match, baseline):
+    """returns the run ids to be used for metric collection
+
+    Args:
+        metadata (dict): metadata from config
+        fingerprint_index (str): index to match kube-burner runs against
+        uuids (list): uuids collected
+        match (Matcher): Matcher object
+        baseline (str): comma-separated baseline uuids, empty when unset
+
+    Returns:
+        list: run ids
+    """
+    if metadata["benchmark.keyword"] in ["ingress-perf", "k8s-netperf"]:
+        return uuids
+    if baseline == "":
+        runs = match.match_kube_burner(uuids, fingerprint_index)
+        ids = match.filter_runs(runs, runs)
+    else:
+        ids = uuids
+    return ids
+
+
 def get_build_urls(index, uuids,match):
     """Gets metadata of the run from each test to get the build url
 
@@ -179,7 +256,64 @@ def get_build_urls(index, uuids,match):
         buildUrls = {run["uuid"]: run["buildUrl"] for run in test}
     return buildUrls
 
-def filter_metadata(uuid,match,logger):
+
+def process_test(test, match, output, uuid, baseline):
+    """generate the dataframe for the given test
+
+    Args:
+        test (dict): the test entry from the config
+        match (Matcher): matcher object
+        output (str): output csv file path
+        uuid (str): base uuid for comparisons, empty when unset
+        baseline (str): comma-separated baseline uuids, empty when unset
+
+    Returns:
+        pd.DataFrame: merged dataframe of all metrics
+    """
+    logger_instance = SingletonLogger(debug=logging.INFO).logger
+    benchmarkIndex = test['benchmarkIndex']
+    fingerprint_index = test['index']
+    if uuid in ('', None):
+        metadata = get_metadata(test)
+    else:
+        metadata = filter_metadata(uuid, match)
+    logger_instance.info("The test %s has started", test["name"])
+    if baseline in ('', None):
+        runs = match.get_uuid_by_metadata(metadata)
+        uuids = [run["uuid"] for run in runs]
+        buildUrls = {run["uuid"]: run["buildUrl"] for run in runs}
+        if len(uuids) == 0:
+            logger_instance.error("No UUID present for given metadata")
+            return None
+    else:
+        uuids = [uuid for uuid in re.split(' |,', baseline) if uuid]
+        uuids.append(uuid)
+        buildUrls = get_build_urls(fingerprint_index, uuids, match)
+    fingerprint_index = benchmarkIndex
+    ids = get_ids_from_index(metadata, fingerprint_index, uuids, match, baseline)
+
+    metrics = test["metrics"]
+    dataframe_list = get_metric_data(ids, fingerprint_index, metrics, match)
+
+    for i, df in enumerate(dataframe_list):
+        if i != 0 and ('timestamp' in df.columns):
+            dataframe_list[i] = df.drop(columns=['timestamp'])
+
+    merged_df = reduce(
+        lambda left, right: pd.merge(left, right, on="uuid", how="inner"),
+        dataframe_list,
+    )
+    shortener = pyshorteners.Shortener(timeout=10)
+    merged_df["buildUrl"] = merged_df["uuid"].apply(
+        lambda uuid: shortener.tinyurl.short(buildUrls[uuid])  # pylint: disable = cell-var-from-loop
+    )
+    output_file_path = output.split(".")[0] + "-" + test["name"] + ".csv"
+    match.save_results(merged_df, csv_file_path=output_file_path)
+    return merged_df
+
+
+def filter_metadata(uuid, match):
     """Gets metadata of the run from each test
 
     Args:
@@ -190,7 +324,7 @@ def filter_metadata(uuid,match):
     Returns:
         dict: dictionary of the metadata
     """
-
+    logger_instance = SingletonLogger(debug=logging.INFO).logger
     test = match.get_metadata_by_uuid(uuid)
     metadata = {
         'platform': '',
         'masterNodesType': '',
         'masterNodesCount': '',
         'workerNodesType': '',
         'workerNodesCount': '',
         'benchmark.keyword': '',
         'ocpVersion': '',
         'networkType': '',
         'encrypted': '',
         'fips': '',
         'ipsec': ''
     }
@@ -220,50 +354,5 @@ def filter_metadata(uuid,match):
 
     #Remove any keys that have blank values
     no_blank_meta = {k: v for k, v in metadata.items() if v}
-    logger.debug('No blank metadata dict: ' + str(no_blank_meta))
+    logger_instance.debug('No blank metadata dict: ' + str(no_blank_meta))
     return no_blank_meta
-
-
-
-def set_logging(level, logger):
-    """sets log level and format
-
-    Args:
-        level (_type_): level of the log
-        logger (_type_): logger object
-
-    Returns:
-        logging.Logger: a formatted and level set logger
-    """
-    logger.setLevel(level)
-    handler = logging.StreamHandler(sys.stdout)
-    handler.setLevel(level)
-    formatter = logging.Formatter(
-        "%(asctime)s [%(name)s:%(filename)s:%(lineno)d] %(levelname)s: %(message)s"
-    )
-    handler.setFormatter(formatter)
-    logger.addHandler(handler)
-    return logger
-
-
-def load_config(config, logger):
-    """Loads config file
-
-    Args:
-        config (str): path to config file
-        logger (Logger): logger
-
-    Returns:
-        dict: dictionary of the config file
-    """
-    try:
-        with open(config, "r", encoding="utf-8") as file:
-            data = yaml.safe_load(file)
-            logger.debug("The %s file has successfully loaded", config)
-    except FileNotFoundError as e:
-        logger.error("Config file not found: %s", e)
-        sys.exit(1)
-    except Exception as e:  # pylint: disable=broad-exception-caught
occurred: %s", e) - sys.exit(1) - return data diff --git a/requirements.txt b/requirements.txt index d3224dd..f217a61 100644 --- a/requirements.txt +++ b/requirements.txt @@ -4,10 +4,15 @@ click==8.1.7 elastic-transport==8.11.0 elasticsearch==7.13.0 fmatch==0.0.7 +Jinja2==3.1.3 python-dateutil==2.8.2 pytz==2023.3.post1 PyYAML==6.0.1 six==1.16.0 tzdata==2023.4 urllib3==1.26.18 -pyshorteners==1.0.1 \ No newline at end of file +pyshorteners==1.0.1 +fastapi==0.110.0 +pyshorteners==1.0.1 +python-multipart==0.0.9 +uvicorn==0.28.0 diff --git a/setup.py b/setup.py index 52fdafc..bfbb8ce 100644 --- a/setup.py +++ b/setup.py @@ -14,11 +14,12 @@ ], entry_points={ 'console_scripts': [ - 'orion = orion:orion', + 'orion = orion:cli', ], }, packages=find_packages(), - package_data={'utils': ['utils.py'],'hunter': ['*.py']}, + package_data={'pkg': ['utils.py',"runTest.py","daemon.py","logrus.py"], + 'configs':['*.yml','*.yaml']}, classifiers=[ 'Programming Language :: Python :: 3', 'License :: OSI Approved :: MIT License',