Proposed approach for OSEP WG #3
Signed-off-by: Paul Albertella <paul.albertella@codethink.co.uk>
Comments from the Safety Architecture WG (07 Jun)
- OS-level system conditions that *may* lead to these losses
* System-level constraints
- Criteria that must be satisfied to *prevent* or *mitigate* hazards
- May be a simple inversion of the hazard
Feedback from the Safety Arch WG: product constraints imposed by other, non-FuSa requirements are missing here. It would be good to mention them for the sake of completeness.
Agreed. I will add this.
- i.e. lead to harm in a safety-related system
* Hazards
- OS-level system conditions that *may* lead to these losses
* System-level constraints
Feedback from the Safety Arch WG:
- The STPA handbook says that the system-level analysis should be used to derive safety requirements for the system (in our case, I guess, for the OS); this flow does not mention that anywhere.
- Throughout the hierarchical STPA iterations, the flow in the handbook effectively substitutes "system constraints" for "safety requirements"; this continues until the very last iteration, and only at that stage can "safety requirements" be defined. This nomenclature does not really fit a hierarchical breakdown of a SW element, so it may be better to rename "system-level constraints" to "safety requirements".
- Agreed. The system-level constraints represent the top-level safety requirements / safety goals. I will add this.
- The intended purpose of using STPA is to derive more detailed safety requirements from these high-level constraints, by defining Controller Constraints (which specify the criteria for avoiding UCAs) and Loss Scenarios (which can be used in combination with Controller Constraints to define test cases, including fault injection test cases).
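To illustrate the traceability described in this reply, here is a minimal Python sketch of how a Controller Constraint could be traced back, via its UCA and the hazards that UCA can lead to, to the system-level constraints it supports. Every class name, identifier and example text (`SC-1`, `H-1`, `UCA-1`, `CC-1`, the scheduler example) is a hypothetical illustration and is not taken from approach.md or from any STPA tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SystemConstraint:
    ident: str
    text: str
    hazard: str  # identifier of the hazard this constraint prevents or mitigates


@dataclass(frozen=True)
class UnsafeControlAction:
    ident: str
    controller: str
    control_action: str
    context: str  # system conditions under which the action is unsafe
    hazards: tuple[str, ...]  # identifiers of the hazards it can lead to


@dataclass(frozen=True)
class ControllerConstraint:
    ident: str
    text: str
    uca: str  # identifier of the UCA this constraint is meant to avoid


def supported_system_constraints(
    cc: ControllerConstraint,
    ucas: dict[str, UnsafeControlAction],
    system_constraints: list[SystemConstraint],
) -> list[str]:
    """Return the identifiers of the system-level constraints that `cc` supports."""
    uca = ucas[cc.uca]
    return sorted(sc.ident for sc in system_constraints if sc.hazard in uca.hazards)


# Hypothetical example data, for illustration only.
system_constraints = [
    SystemConstraint("SC-1", "OS must not delay a safety-critical task past its deadline", "H-1"),
    SystemConstraint("SC-2", "OS must not corrupt memory owned by a safety-critical task", "H-2"),
]
ucas = {
    "UCA-1": UnsafeControlAction(
        ident="UCA-1",
        controller="scheduler",
        control_action="dispatch task",
        context="a safety-critical task is runnable and close to its deadline",
        hazards=("H-1",),
    )
}
cc = ControllerConstraint(
    "CC-1", "Scheduler must dispatch a runnable safety-critical task before its deadline", "UCA-1"
)

print(supported_system_constraints(cc, ucas, system_constraints))
```

Keeping the links as plain identifiers is only one possible design choice; it makes it easy to check that every detailed requirement can be traced back to a top-level safety requirement.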
With respect to point 2: I understand the goal of STPA as described in the handbook. However, if we want to use an iterative approach, we need to define new constraints and new controllers at each iteration inside the Kernel. In doing so, I find it a bit confusing to use the term "constraints" instead of "SW Safety Requirements and AoUs".
approach.md
Outdated
* Find supporting evidence (design, tests, processes) for these
* Identify responsibilities of components (kernel, compiler, etc)

c) Perform hazard and risk analysis using STPA
Feedback from the Safety Arch WG: b) and c) should be swapped, because we first provide an architectural description, then evaluate hazards, and finally look for possible countermeasures for each hazard.
That might make more sense, but this will be an iterative process.
For b) I was thinking that we might include measures or mitigations that we believe Linux already provides, but it may be clearer to omit these on the first pass of analysis and then add them in for a second pass, to address gaps identified.
approach.md
Outdated
b) Identify and document system measures and mitigations for Linux

*Out of scope for OSEP - LFSCS group?*
From Gab: On this bullet too, instead of "Out of scope", I'd say "In collaboration with LFSCS and Safety Architecture WGs?"
Agreed.
- Functional abstraction of system elements
- e.g. Kernel or subsystems, compiler, other tools, etc
- Define interactions as control actions and feedback
- Identify Unsafe Control Actions and Loss Scenarios
From Gab: I guess unsafe control actions are those that are associated with hazards, and hence lead to loss scenarios, correct?
Yes, UCAs are control actions that, for a particular set of system conditions, can lead to one or more hazards. Loss scenarios typically describe the causal factors that lead to UCAs, but may also describe other scenarios (not associated with a Control Action) that can result in hazards.
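As a rough illustration of "control actions that, for a particular set of system conditions, can lead to hazards", the Python sketch below enumerates candidate UCAs for one control action by combining each context with the four ways the STPA Handbook says a control action can be unsafe. The controller, control action and contexts are hypothetical, and each generated candidate is only a prompt for the analyst, who still has to judge whether it actually leads to a hazard.

```python
from enum import Enum
from typing import List, NamedTuple


class UcaType(Enum):
    """The four ways a control action can be unsafe, per the STPA Handbook."""

    NOT_PROVIDED = "not providing the control action leads to a hazard"
    PROVIDED = "providing the control action leads to a hazard"
    WRONG_TIMING = "provided too early, too late, or out of order"
    WRONG_DURATION = "stopped too soon or applied too long"


class CandidateUca(NamedTuple):
    controller: str
    control_action: str
    uca_type: UcaType
    context: str  # the system conditions under which the action may be unsafe


def candidate_ucas(controller: str, control_action: str, contexts: List[str]) -> List[CandidateUca]:
    """Combine every context with every UCA type to produce candidates for review."""
    return [
        CandidateUca(controller, control_action, uca_type, context)
        for context in contexts
        for uca_type in UcaType
    ]


# Hypothetical example: a kernel memory allocator acting as the controller.
candidates = candidate_ucas(
    controller="memory allocator",
    control_action="grant allocation",
    contexts=["system memory is nearly exhausted", "request comes from a safety-critical task"],
)

for c in candidates:
    print(f"{c.controller} / {c.control_action}: {c.uca_type.value} when {c.context}")
```

Loss scenarios are deliberately left out of this sketch, since they describe causal factors behind UCAs (or other routes to a hazard) rather than the UCAs themselves.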
- Define interactions as control actions and feedback
- Identify Unsafe Control Actions and Loss Scenarios
- Define Controller Constraints
From the Safety Arch WG: At this stage I think we would have some controllers with unsafe control actions for which we do not have an architectural mitigation. There are then two options:
a) provide a more detailed architectural description of the controller by partitioning it into multiple controllers and allocating safety requirements (constraints, if we want to use the STPA terminology) to each of them, then iterate back to step b)
b) the complexity and/or architectural description of the single controller is "acceptable" for considering it an elementary SW design element, and hence we can move to step d) (TBD: discuss acceptable complexity and/or architectural criteria)
Breaking a controller (or a controlled process) down into sub-components, or performing a new analysis at a more detailed level of abstraction, should certainly be an option. What I'd like to discuss in the WG at some point is when we need to do this. Your suggested trigger (we can't identify a mitigation) is only one example, in my opinion; another might be that we have identified a causal factor for a UCA during Loss Scenario definition, which reveals controllers or controlled processes that are involved at a lower level of granularity.
IMO one of the stop criteria is when all UCAs are fully mitigated by safety mechanisms external to the controller or by AoUs on the controller. If that is not the case, then we need the systematic capability of the controller to be high enough to claim that it is reasonably free of UCAs. The problem then becomes the 'acceptable complexity criteria' for the controller...
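One way to read the options discussed in this thread is as a per-controller decision at the end of each iteration. The Python sketch below is only that reading, under stated assumptions: the names (`Uca`, `NextStep`, `next_step`) are hypothetical, the `acceptable_complexity` flag stands in for criteria the WG has not yet defined, and other triggers for further breakdown mentioned above (such as causal factors found during Loss Scenario definition) are not modelled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class NextStep(Enum):
    STOP = "all UCAs mitigated externally or by AoUs; no further breakdown needed"
    ARGUE_SYSTEMATIC_CAPABILITY = "treat controller as elementary; move on to gathering evidence"
    PARTITION = "partition the controller and iterate the analysis"


@dataclass
class Uca:
    ident: str
    external_mechanism: bool = False  # mitigated by a safety mechanism outside the controller
    assumption_of_use: bool = False   # mitigated by an AoU placed on the controller

    @property
    def mitigated(self) -> bool:
        return self.external_mechanism or self.assumption_of_use


def next_step(ucas: List[Uca], acceptable_complexity: bool) -> NextStep:
    """Decide what to do with one controller after an analysis iteration."""
    # Note: a controller with no UCAs at all also satisfies this stop criterion.
    if all(uca.mitigated for uca in ucas):
        return NextStep.STOP
    if acceptable_complexity:
        # Placeholder for the still-to-be-agreed 'acceptable complexity criteria'.
        return NextStep.ARGUE_SYSTEMATIC_CAPABILITY
    return NextStep.PARTITION


# Hypothetical example: one UCA is covered by an AoU, the other has no mitigation.
ucas = [Uca("UCA-1", assumption_of_use=True), Uca("UCA-2")]
print(next_step(ucas, acceptable_complexity=False).name)
```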
- Identify or provide other evidence to support claims
- e.g. Quality criteria
* Find supporting evidence from FOSS communities
- Formal, verifiable process or inconsistent practice?
From the Safety Arch WG: In view of an iterative approach, we should highlight that failing to obtain the process evidence needed to claim the systematic capability of the controller could also mean that further partitioning of the controller is required.
approach.md
Outdated
* Find supporting evidence from FOSS communities
- Formal, verifiable process or inconsistent practice?

e) Identify and document claims and use cases
From the Safety Arch WG: OPEN - it is not clear whether any condition during this phase could require iterating back to previous phases.
- Based on Eli Hibshoosh's slides Signed-off-by: Paul Albertella <paul.albertella@codethink.co.uk>
- Based on review comments from Safety Architecture WG Signed-off-by: Paul Albertella <paul.albertella@codethink.co.uk>
Goal: 'Composable' safety analysis approach for Linux
Objectives: