-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for Daemon mode for orion #26
Support for Daemon mode for orion #26
Conversation
8966d35
to
46dab64
Compare
Hi @paigerube14 @jtaleric just rebased with latest changes. |
@@ -10,3 +10,5 @@ PyYAML==6.0.1 | |||
six==1.16.0 | |||
tzdata==2023.4 | |||
urllib3==1.26.18 | |||
fastapi==0.110.0 | |||
python-multipart==0.0.9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried installing this latest requirements and got error below when trying to run:
(venv3) prubenda@prubenda1-mac orion % orion --debug
Traceback (most recent call last):
File "/Users/prubenda/PycharmProjects/pplRepos/shashank/orion/venv3/bin/orion", line 5, in <module>
from orion import cli
File "/Users/prubenda/PycharmProjects/pplRepos/shashank/orion/venv3/lib/python3.10/site-packages/orion.py", line 9, in <module>
import uvicorn
ModuleNotFoundError: No module named 'uvicorn'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will add uvicorn to requirements.txt, also planning on integrating pdm so we don't run into requirement issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use orion cmd-mode
or orion daemon-mode
, I added these new commands to differentiate between the two.
README.md
Outdated
@@ -92,6 +95,45 @@ Activate Orion's regression detection tool for performance-scale CPT runs effort | |||
|
|||
Additionally, users can specify a custom path for the output CSV file using the ```--output``` flag, providing control over the location where the generated CSV will be stored. | |||
|
|||
### Daemon mode | |||
The core purpose of Daemon mode is to operate Orion as a self-contained server, dedicated to handling incoming requests. By sending a POST request accompanied by a configuration file, users can trigger change point detection on the provided metadata and metrics. Following the processing, the response is formatted in JSON, providing a structured output for seamless integration and analysis. To trigger daemon mode just use the following commands |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here it would be nice to also have an example post url with a config file before showing the output of it. Just reading this its not clear where to make the post
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it!! will add a post URL for further clarity.
Returns: | ||
_type_: _description_ | ||
""" | ||
logger_instance = SingletonLogger(debug=logging.INFO).logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if this is the issue causing: cloud-bulldozer/py-commons#22 because it is creating a new logger each time a new post is ran in the daemon mode
I would think you would want to try to pass the logger from upper daemon mode setting to here instead of creating a new one every run and in sub functions. IE the daemon mode cli would create 1 single logger and every time orion is called it would use the same one. This would also help with this issue. I might be thinking about it wrong but any thoughts? @jtaleric @shashank-boyapally @vishnuchalla
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @paigerube14 , since we implemented SingletonLogger, it only creates the logger for once, later everywhere we call SingletonLogger(debug=logging.INFO).logger
it returns back the already created logger instead of creating a new one.
Also, since the context(i.e. file and line info) the logger prints is different in fmatch and orion, we could see that only logger of fmatch is duplicating and not orion. I think passing the logger to each function was redundant which was the case in our previous implementation. In order to tackle this SingletonLogger
class was implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, I see. I missed the ".logger" on each of the singletonlogger calls. Thanks for clarifying
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more point to consider here: #26 (comment)
README.md
Outdated
*Parameters* | ||
|
||
- uuid (optional): The uuid of the run you want to compare with similar runs. | ||
- baseline (optional): The runs you want to compare. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- baseline (optional): The runs you want to compare. | |
- baseline (optional): The runs you want to compare with. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes more sense, will be adding this.
README.md
Outdated
@@ -1,5 +1,5 @@ | |||
# Orion - CLI tool to find regressions | |||
Orion stands as a powerful command-line tool designed for identifying regressions within perf-scale CPT runs, leveraging metadata provided during the process. The detection mechanism relies on [hunter](https://github.com/datastax-labs/hunter). | |||
Orion stands as a powerful command-line tool/Daemon designed for identifying regressions within perf-scale CPT runs, leveraging metadata provided during the process. The detection mechanism relies on [hunter](https://github.com/datastax-labs/hunter). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Daemon/daemon/*
|
||
Below is a sample output structure: the top level of the JSON contains the test name, while within each test, runs are organized into arrays. Each run includes succinct metadata alongside corresponding metrics for comprehensive analysis. | ||
``` | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This output structure has to align with the remaining open PRs, let get them merged one by one. A reasonable order would be
- Json output #21 -> display build URL for changepoint #15 -> and finally this one.
wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO this would be the cleaner approach.
import sys | ||
|
||
|
||
class SingletonLogger: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this SingletonLogger
to py-commons
and re-use it for both fmatch and orion?. Because logger implementation here is thread safe but not here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense.
""" | ||
level = logging.DEBUG if debug else logging.INFO | ||
logger_instance = SingletonLogger(debug=level).logger | ||
logger_instance.info("🏹 Starting Orion in Daemon mode") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if 🏹 is appropriate for orion, may be we want to look for an alternative here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went ahead with bow and arrow as it relates with hunter and orion. Would love to have more suggestions.
orion.py
Outdated
# pylint: disable=too-many-locals | ||
@click.command() | ||
@cli.command(name="cmd-mode") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just use cmd
and daemon
as the subcommands?
pkg/utils.py
Outdated
@@ -0,0 +1,328 @@ | |||
# pylint: disable=cyclic-import |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this entire file needs a rebase.
pkg/runTest.py
Outdated
result, test, output=kwargs["output_format"], matcher=match | ||
) | ||
result_output[testname] = result_data | ||
del match |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this del match
automatically done as a part of GC?
pkg/utils.py
Outdated
Returns: | ||
_type_: index and uuids | ||
""" | ||
if metadata["benchmark.keyword"] == "k8s-netperf": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one might grow in future. Lets have dict object defined for this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good except for some concerns. Let take the open PRs in order.
585aa1b
to
1ef2b92
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File "/home/jtaleric/code/orion/test/bin/orion", line 33, in <module>
sys.exit(load_entry_point('orion==1.0', 'console_scripts', 'orion')())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jtaleric/code/orion/test/bin/orion", line 25, in importlib_load_entry_point
return next(matches).load()
^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/importlib/metadata/__init__.py", line 202, in load
module = import_module(match.group('module'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 940, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/home/jtaleric/code/orion/test/lib64/python3.11/site-packages/orion.py", line 11, in <module>
from pkg.runTest import run
File "/home/jtaleric/code/orion/test/lib64/python3.11/site-packages/pkg/runTest.py", line 8, in <module>
from pkg.utils import run_hunter_analyze, load_config, get_es_url, process_test
File "/home/jtaleric/code/orion/test/lib64/python3.11/site-packages/pkg/utils.py", line 19, in <module>
import pyshorteners
ModuleNotFoundError: No module named 'pyshorteners'
Is this expected? This is newly introduced with this PR as I am able to install the main branch w/o issue.
I added pyshorteners to shorten the buildUrl in the display, the requirements.txt file is not updated since I wanted to update it along fmatch 0.0.6 when released. There would be some function changes in fmatch 0.0.6. |
@shashank-boyapally can you let us know when this is ready for folks to test? |
Hi @jtaleric, the feature is ready to be tested. I updated the requirements with the latest version of fmatch and pyshorteners. I request kindly to please test and let me know any feedback. |
ack - I just saw conflicts... |
5e1ca0e
to
5087863
Compare
README.md
Outdated
|
||
Example | ||
``` | ||
curl -X POST 'http://127.0.0.1:8000/daemon?uuid=4cb3efec-609a-4ac5-985d-4cbbcbb11625' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, I think preferably the endpoints the users hit are more straightforward.
This approach expects the caller to provide the uuid and config. I think this is one option for the daemon, but it leaves the opportunity for error.
I think a more opinionated approach should be how we initially deploy. For example
http://orion.perflab:8000/?payload=true&version=4.16
this endpoint will then use the payload data we have in our data-warehouse, since that will have the most job runs. We could even drop the payload
and assume it is set to true by default.
I think the initial endpoint you are suggesting is something else we can consider (with some adjustments) for more custom change-detection -- however I don't think the http endpoint needs to be like the CLI. For example, we shouldn't have the user post the config via http, I think we have candid configs the user provides are daemon launch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't dug into why this is happening, but with our main branch, I run with the cmd
$ orion --config config.yaml --hunter-analyze
And I get the expected output.
However with this branch, I run the exact same config/test and see
$ orion cmd --config config.yaml --hunter-analyze
File "/home/jtaleric/code/orion/test/bin/orion", line 8, in <module>
sys.exit(cli())
^^^^^
File "/home/jtaleric/code/orion/test/lib64/python3.11/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jtaleric/code/orion/test/lib64/python3.11/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/home/jtaleric/code/orion/test/lib64/python3.11/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jtaleric/code/orion/test/lib64/python3.11/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jtaleric/code/orion/test/lib64/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jtaleric/code/orion/test/lib64/python3.11/site-packages/orion.py", line 55, in cmd_analysis
output = run(**kwargs)
^^^^^^^^^^^^^
File "/home/jtaleric/code/orion/test/lib64/python3.11/site-packages/pkg/runTest.py", line 31, in run
result = process_test(
^^^^^^^^^^^^^
File "/home/jtaleric/code/orion/test/lib64/python3.11/site-packages/pkg/utils.py", line 263, in process_test
uuids = [run["uuid"] for run in runs]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jtaleric/code/orion/test/lib64/python3.11/site-packages/pkg/utils.py", line 263, in <listcomp>
uuids = [run["uuid"] for run in runs]
~~~^^^^^^^^
TypeError: string indices must be integers, not 'str'
It appears that we expect
|
I cleaned up my venv and now I get output... however.. New issue.
and with the main branch
|
I chatted with @shashank-boyapally on Slack. Here is the current thoughts -
|
+1 on these ideas. To add on top of it just so that we don't loose track, Hunter also has this feature of having tests divided into groups and then compare between them for regressions. That would also be a good use case to add later when we get to that point. |
@jtaleric @shashank-boyapally these changes sound good to me, one question is is there a way to set a timeframe that if we have a job from say January that shows a regression, should we continue to report that issue with every run or should we set a time period (last 2 weeks) or number of runs back (last 10 runs) to limit re-reporting regressions? |
Hunter also has a parameter where we can specify the timestamp field to look at data regressions since a starting point. Ahh, I forgot to mention earlier, it would be a good addition too. |
Hi Paige, my opinion on this is we should have the previous regression showing in both cmd mode and daemon, the service consuming the api should be able to filter it out based on the timestamp if needed. My take on having the whole regressions based upon the timeline each version of openshift is tested ~ 6 months. One thing we can do is hunter has a timestamp filter which can ignore previous runs before that if we want to have that functionality. |
I'm thinking in terms of orion running in a CI where we might only want to know if the latest/current run is showing a regression from previous runs and fail the job if a regression is detected. I think the timestamping from hunter would be helpful with that as both @vishnuchalla and @shashank-boyapally mentioned. Still a ways out from getting this into a CI though. Just a thought. I'll open an issue to track this option |
I added verify certs as an argument so that it can support the new es instances, the PR can be merged once we have fmatch 0.0.7 |
As per our discussion offline, please break down this PR into atomic commits. Thanks |
Signed-off-by: Shashank Reddy Boyapally <[email protected]>
Signed-off-by: Shashank Reddy Boyapally <[email protected]>
Signed-off-by: Shashank Reddy Boyapally <[email protected]>
Signed-off-by: Shashank Reddy Boyapally <[email protected]>
Signed-off-by: Shashank Reddy Boyapally <[email protected]>
Signed-off-by: Shashank Reddy Boyapally <[email protected]>
Signed-off-by: Shashank Reddy Boyapally <[email protected]>
Signed-off-by: Shashank Reddy Boyapally <[email protected]>
…ode, rebased, fixed trivial bugs Signed-off-by: Shashank Reddy Boyapally <[email protected]>
e1ca7c3
to
8e1b95b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. Thanks
Signed-off-by: Shashank Reddy Boyapally <[email protected]>
workerNodesCount: 6 | ||
infraNodesCount: 3 | ||
benchmark.keyword: node-density-cni | ||
ocpVersion: {{ version }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we maybe put in the title template here so that a user doesn't take this file and just try to run as this version is not filled in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that makes sense, will add a comment mentioning it a template.
Signed-off-by: Shashank Reddy Boyapally <[email protected]>
@@ -0,0 +1,59 @@ | |||
# This is a template file | |||
tests : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hit a small issue with the tiny url timing out when trying to run the below, increasing the timeout I was able to successfully run the command
orion cmd --config examples/small-scale-cluster-density.yaml
`shortener = pyshorteners.Shortener(timeout=10)`
Signed-off-by: Shashank Reddy Boyapally <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Type of change
Description
Do not MERGE until fmatch 0.0.6 is released
Firstly, please forgive for changing many things in a single PR. Below are the updates following this PR.
orion daemon
. This is the opinionated version of daemon mode, where the tests are pre-determined withsmall-scale-cluster-density
being default.Get list of options of tests available using
cmd-mode
run it usingorion cmd
daemon-mode
.Related Tickets & Documents
Checklist before requesting a review
Testing