-
Notifications
You must be signed in to change notification settings - Fork 27
feat: add problog rule inference to build as code check #260
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: staging
Are you sure you want to change the base?
feat: add problog rule inference to build as code check #260
Conversation
Signed-off-by: sophie-bates <[email protected]>
…y values using ProbLog. Signed-off-by: sophie-bates <[email protected]>
…ugh problog inference Signed-off-by: sophie-bates <[email protected]>
…sed subcheck Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
print(key, value) | ||
if str(key) == "build_as_code_check": | ||
confidence_score = float(value) | ||
results = vars(build_as_code_subchecks.build_as_code_subcheck_results) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's okay to use this built in function for now. But we should keep in mind that this function would raise TypeError when __dict__
attribute could not be found within the target object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your comment Nhan, I'm actually about to push a change that will mean I'm no longer using this anyway so hopefully it's all good.
self.check_results: dict = {} # Update this with each check. | ||
self.ci_info = ci_info | ||
self.ci_service = ci_info["service"] | ||
self.failed_check = 0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit confused by this attribute failed_check
. Could you document what this attribute is used for and what is the value that it contains?
From a first look, I was expecting a boolean
type instead of a float. Is this attribute the value that we should return if a sub check fails?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's the value that should be returned if the sub check fails.
check_certainty = 1.0 | ||
# If this check has already been run on this repo, return certainty. | ||
|
||
justification: list[str | dict[str, str]] = ["The CI workflow files for this CI service are parsed."] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could define the justification inside the if branch so that we don't need to do it when self.ci_info["bash_commands"]
is not available.
…ermine which results to store Signed-off-by: sophie-bates <[email protected]>
# TODO: Investigate using proofs | ||
|
||
# Check whether the confidence score is greater than the minimum threshold for this check. | ||
if confidence_score >= self.confidence_score_threshold: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this line, if a CI service has a confidence_score
larger than the threshold, but smaller than other CI services, wouldn't the check return the result for the smaller confidence_score
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. I can update this implementation to store all of the required information (i.e. deploy commands) for each ci_service
and then populate the BuilasAsCodeTable
and AnalyzeContext
with whichever ci_service
has the highest confidence_score
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, however I don't understand why AnalyzeContext
needs to be updated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just meant where the ci_info
information is updated, i.e. here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's the inferred provenance representation. We don't update ci_info
itself.
Signed-off-by: sophie-bates <[email protected]>
…lgraph Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
…n sub-checks Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
…workflow timestamp Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
… deployment method Signed-off-by: sophie-bates <[email protected]>
… for trigger event type sub-task Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
Signed-off-by: sophie-bates <[email protected]>
… for particular event types Signed-off-by: sophie-bates <[email protected]>
Priority (v1)
pyproject.toml
with ProbLog dependency.run_check
method into smaller subchecks.ci_service
and compare confidence_scores to produce final result. Currently just using the first CI service.setup.py
file. Timestamp check requires the project name. This is straightforward to fetch for projects where it's specified in thepyproject.toml
, but less so for those that usesetup.py
. I developed an initial implementation forsetup.py
but later removed it because I realised it was only applicable for very specified scenarios. The other option was to executesetup.py
to fetch the name, but for obvious reasons we shouldn't do that.Subchecks
github_actions
function to fetch the 5 most recent runs of a workflow file.Note: for many of the sub-checks we've had to do a check for deploy action method vs deploy command as there can be multiple methods used in a workflow. Need to figure out a more streamlined way to handle this.
Later
tox -e release
as a deploy command, howeverrelease
is just a target pointing to a section in atox.ini
file, so potentially need to investigate parsing these files so verify that it contains deploy commands.Notes: