It should be relatively straightforward to create a tool that can determine many of the code constructs that are causing problems for a specific tool. Using the expected-results full-details file and the YAML file from a generated test suite, along with the actual results for a particular tool (from BenchmarkScore), do something like:
Generate a list of every code snippet used to generate that test suite (straight from the YAML file?).
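A minimal sketch of that first step, assuming a YAML layout I'm guessing at (a top-level `testCases` list whose entries name their `source`, `dataflow`, and `sink` snippets; the real field names would come from the suite generator). SnakeYAML is used here only as a common Java YAML parser:

```java
import java.io.FileInputStream;
import java.util.*;
import org.yaml.snakeyaml.Yaml;

class SnippetLister {
    // Collect every snippet name referenced by any test case in the suite YAML.
    // The "testCases"/"source"/"dataflow"/"sink" keys are illustrative assumptions.
    static Set<String> listSnippets(String yamlPath) throws Exception {
        Map<String, Object> suite = new Yaml().load(new FileInputStream(yamlPath));
        Set<String> snippets = new LinkedHashSet<>();
        List<Map<String, String>> testCases =
                (List<Map<String, String>>) suite.get("testCases");
        for (Map<String, String> tc : testCases) {
            for (String role : Arrays.asList("source", "dataflow", "sink")) {
                String snippet = tc.get(role);
                if (snippet != null) snippets.add(snippet);
            }
        }
        return snippets;
    }
}
```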
Create a bidirectional data structure (sketched below), like so:
- A data structure for each code snippet that links to every test case that uses it.
- A data structure for each test case that links to every code snippet used in it.
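A minimal sketch of those two maps, with illustrative names (nothing here is from the existing codebase):

```java
import java.util.*;

class SnippetIndex {
    // snippet name -> every test case that uses it
    final Map<String, Set<String>> testCasesBySnippet = new HashMap<>();
    // test case name -> every snippet it was built from
    final Map<String, Set<String>> snippetsByTestCase = new HashMap<>();

    // Record one snippet/test-case pairing in both directions.
    void link(String snippet, String testCase) {
        testCasesBySnippet.computeIfAbsent(snippet, k -> new HashSet<>()).add(testCase);
        snippetsByTestCase.computeIfAbsent(testCase, k -> new HashSet<>()).add(snippet);
    }
}
```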
Pass 1: Go through each True Positive detected by the tool and mark each code snippet used in it as 'correctly understood' (i.e., both snippets, or all three if the test case has a dataflow snippet).
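Pass 1 could be as simple as the method below, in a hypothetical analyzer class; `truePositives` is assumed to be the set of test-case names the tool detected correctly, derived by comparing its actual results against the expected-results file:

```java
import java.util.*;

class ConstructAnalyzer {
    // Every snippet used by a detected True Positive is marked "understood"
    // (which covers both snippets, or all three when a dataflow is present).
    static Set<String> markUnderstood(SnippetIndex index, Set<String> truePositives) {
        Set<String> understood = new HashSet<>();
        for (String testCase : truePositives) {
            understood.addAll(index.snippetsByTestCase.getOrDefault(testCase, Set.of()));
        }
        return understood;
    }
}
```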
Pass 2: Go through each test case and identify any where exactly one snippet remains not 'understood', then generate lists of the sources, dataflows, and sinks to focus on.
Sanity Check: See if there are any test cases with all snippets marked understood where the tool nevertheless reports a False Positive.
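Pass 2 and the sanity check could share one loop, sketched below as further methods of the hypothetical ConstructAnalyzer. `roleOf` (mapping a snippet to source/dataflow/sink) is a made-up helper here; the real role would come from the suite's YAML:

```java
// For each test case: exactly one not-understood snippet -> that snippet is
// the likely culprit, bucketed by role. Zero not-understood snippets on a
// test case the tool still gets wrong (a reported FP) contradicts the model
// and is flagged by the sanity check.
static Map<String, List<String>> findFocusAreas(SnippetIndex index,
        Set<String> understood, Set<String> falsePositives) {
    Map<String, List<String>> focus = new HashMap<>(); // role -> suspect snippets
    List<String> contradictions = new ArrayList<>();
    for (Map.Entry<String, Set<String>> e : index.snippetsByTestCase.entrySet()) {
        List<String> notUnderstood = new ArrayList<>();
        for (String s : e.getValue()) {
            if (!understood.contains(s)) notUnderstood.add(s);
        }
        if (notUnderstood.size() == 1) {
            String snippet = notUnderstood.get(0);
            focus.computeIfAbsent(roleOf(snippet), k -> new ArrayList<>()).add(snippet);
        } else if (notUnderstood.isEmpty() && falsePositives.contains(e.getKey())) {
            contradictions.add(e.getKey()); // sanity-check hit
        }
    }
    System.out.println("Fully-understood test cases still scored as FPs: " + contradictions);
    return focus;
}

// Hypothetical helper, stubbed so the sketch compiles; the real
// source/dataflow/sink role would be looked up from the suite YAML.
static String roleOf(String snippet) {
    return "unknown";
}
```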
Once this is working, update the tool to automatically calculate this for every actual-results file in the /scorecard directory (i.e., do this for ALL tools).
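Driving that across every tool is then just a directory walk; the `*results*` glob below is a guess at the naming convention for actual-results files, so substitute whatever pattern BenchmarkScore actually writes:

```java
import java.io.IOException;
import java.nio.file.*;

// Run the per-tool analysis above for each actual-results file in /scorecard.
static void analyzeAllTools(Path scorecardDir) throws IOException {
    try (DirectoryStream<Path> files =
            Files.newDirectoryStream(scorecardDir, "*results*")) {
        for (Path actualResults : files) {
            System.out.println("Analyzing " + actualResults.getFileName());
            // ... feed this file into the snippet analysis sketched above ...
        }
    }
}
```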
I was thinking this might require multiple analysis phases, but I think that's it?
Stage 2:
Do something similar to detect False Positive problem areas, but by analyzing the findings the tool reports as TPs that are actually FPs.