Proposed approach for OSEP WG #3

Open: 4 commits to merge into base branch main

reiterative
Collaborator

Goal: 'Composable' safety analysis approach for Linux
Objectives:

  • Select topics for analysis and investigation
  • Define activities per topic: iterative, potentially in parallel
  • Coordinate activities between contributors & WGs (and external?)
  • Enable combination and re-use of activity outputs

Signed-off-by: Paul Albertella <paul.albertella@codethink.co.uk>
Contributor

@paolonig left a comment

Comments from the Safety Arch WG (07 Jun)

- OS-level system conditions that *may* lead to these losses
* System-level constraints
- Criteria that must be satisfied to *prevent* or *mitigate* hazards
- May be a simple inversion of the hazard
Contributor

Feedback from the Safety Arch WG: this list omits product constraints imposed by other, non-FuSa requirements. It would be good to mention them for the sake of completeness.

Collaborator Author

Agreed. I will add this.

- i.e. lead to harm in a safety-related system
* Hazards
- OS-level system conditions that *may* lead to these losses
* System-level constraints
Contributor

Feedback from the Safety Arch WG:

  1. The STPA handbook says that from the system-level analysis we should derive safety requirements for the system (in our case, I guess, for the OS); this flow does not mention that anywhere.
  2. Throughout the hierarchical STPA iterations, the STPA flow (from the handbook) effectively substitutes "safety requirements" with "system constraints"; only at the very last iteration can we define "safety requirements". This nomenclature does not really fit a hierarchical breakdown of a SW element (so it may be better to rename "system-level constraints" to "safety requirements").

Collaborator Author

  1. Agreed. The system-level constraints represent the top-level safety requirements / safety goals. I will add this.
  2. The intended purpose of using STPA is to derive more detailed safety requirements from these high-level constraints, by defining Controller Constraints (which specify the criteria for avoiding UCAs) and Loss Scenarios (which can be used in combination with Controller Constraints to define test cases, including fault injection test cases).

Contributor

WRT point 2, I understand the goal of STPA as described in the handbook; however, if we want to use an iterative approach, we need to define new constraints and new controllers at each iteration inside the Kernel. In doing so, I find it a bit confusing to use the term "constraints" instead of "SW Safety Requirements and AoUs"...

approach.md Outdated
* Find supporting evidence (design, tests, processes) for these
* Identify responsibilities of components (kernel, compiler, etc)

c) Perform hazard and risk analysis using STPA
Contributor

Feedback from the Safety Arch WG: b) and c) should be swapped: first we provide an architectural description, then we evaluate hazards, and finally, for each hazard, we look for possible countermeasures.

Collaborator Author

That might make more sense, but this will be an iterative process.

For b) I was thinking that we might include measures or mitigations that we believe Linux already provides, but it may be clearer to omit these on the first pass of analysis and then add them in for a second pass, to address gaps identified.

approach.md Outdated

b) Identify and document system measures and mitigations for Linux

*Out of scope for OSEP - LFSCS group?*
Contributor

From Gab: instead of "out of scope", on this bullet too I'd say "In collaboration with LFSCS and Safety Architecture WGs?"

Collaborator Author

Agreed.

- Functional abstraction of system elements
- e.g. Kernel or subsystems, compiler, other tools, etc
- Define interactions as control actions and feedback
- Identify Unsafe Control Actions and Loss Scenarios
Contributor

From Gab: I guess unsafe control actions are those that are associated with hazards, hence leading to loss scenarios, correct?

Collaborator Author

Yes, UCAs are control actions that, for a particular set of system conditions, can lead to one or more hazards. Loss scenarios typically describe the causal factors that lead to UCAs, but may also describe other scenarios (not associated with a Control Action) that can result in hazards.
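
As an illustration only (not part of the proposal), the relationships described above could be modelled roughly as follows. This is a minimal sketch: every class, field, and function name is hypothetical, and the example entries at the end are invented placeholders rather than results of any analysis.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Hazard:
    ident: str
    description: str


@dataclass
class UnsafeControlAction:
    # A control action is unsafe only for a particular set of system
    # conditions (its context), in which it can lead to hazards.
    control_action: str
    context: str
    hazards: list[Hazard]


@dataclass
class LossScenario:
    # Causal factors leading to a UCA or, when `uca` is None, another
    # route to a hazard that is not tied to a control action.
    causal_factors: list[str]
    hazards: list[Hazard]
    uca: Optional[UnsafeControlAction] = None


def scenarios_for(hazard: Hazard, scenarios: list[LossScenario]) -> list[LossScenario]:
    """Return every loss scenario that can result in the given hazard."""
    return [s for s in scenarios if hazard in s.hazards]


# Invented placeholder entries, for illustration only.
h1 = Hazard("H-1", "placeholder hazard")
uca1 = UnsafeControlAction("placeholder action", "placeholder context", [h1])
with_uca = LossScenario(["placeholder causal factor"], [h1], uca=uca1)
without_uca = LossScenario(["placeholder condition"], [h1])
```

Keeping `uca` optional reflects the point that a loss scenario may describe a route to a hazard that does not pass through any control action.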

- Define interactions as control actions and feedback
- Identify Unsafe Control Actions and Loss Scenarios
- Define Controller Constraints

Contributor

@paolonig Jun 21, 2022

From the Safety Arch WG: at this stage, I think we would have some controllers with unsafe control actions for which we do not have an architectural mitigation. There are then two options:
a) provide a more detailed architectural description of the controller by partitioning it into multiple controllers and allocating safety requirements (constraints, if we want to use the STPA terminology) to each of them, then iterate back to step b)
b) the complexity and/or architectural description of the single controller is "acceptable" for considering it an elementary SW design element, and hence we can move to step d) (TBD: discuss the acceptable complexity and/or architectural criteria)

Collaborator Author

Breaking a controller (or a controlled process) down into sub-components, or performing a new analysis at a more detailed level of abstraction, should certainly be an option. What I'd like to discuss in the WG at some point is when we need to do this. Your suggested trigger (we can't identify a mitigation) is only one example, in my opinion; another might be that we have identified a causal factor for a UCA during Loss Scenario definition, which reveals controllers or controlled processes that are involved at a lower level of granularity.

Contributor

IMO one of the stop criteria is that all UCAs are fully mitigated by safety mechanisms external to the controller or by AoUs on the controller. If that is not the case, then we need to rely on the systematic capability of the controller being high enough to claim that it is reasonably free of UCAs. And here the problem is the 'acceptable complexity criteria' for the controller...
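
To make the options in this thread concrete, the decision could be sketched as below. This is only a rough illustration under stated assumptions: all names are hypothetical, `acceptable_complexity` stands in for criteria that are still TBD, and the other triggers for decomposition mentioned above (such as causal factors found during Loss Scenario definition) are not modelled.

```python
from dataclasses import dataclass
from enum import Enum


class NextStep(Enum):
    STOP = "all UCAs mitigated; no further breakdown needed"
    ARGUE_CAPABILITY = "treat as elementary SW element; move to step d)"
    PARTITION = "split into sub-controllers; iterate back to step b)"


@dataclass
class Uca:
    name: str
    # True if mitigated by a safety mechanism external to the
    # controller or by an AoU on the controller (hypothetical flag).
    mitigated: bool


@dataclass
class Controller:
    name: str
    ucas: list
    # Placeholder for the 'acceptable complexity criteria' (TBD).
    acceptable_complexity: bool


def next_step(controller: Controller) -> NextStep:
    """Pick the next activity for one controller in one iteration."""
    if all(u.mitigated for u in controller.ucas):
        return NextStep.STOP
    if controller.acceptable_complexity:
        return NextStep.ARGUE_CAPABILITY
    return NextStep.PARTITION
```

A controller with an unmitigated UCA and unacceptable complexity yields `NextStep.PARTITION`, after which the same check would be applied to each resulting sub-controller.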

- Identify or provide other evidence to support claims
- e.g. Quality criteria
* Find supporting evidence from FOSS communities
- Formal, verifiable process or inconsistent practice?
Contributor

From the Safety Arch WG: in view of an iterative approach, we should highlight that failing to obtain the process evidence required to claim the systematic capability of the controller could also require further partitioning of the controller.

approach.md Outdated
* Find supporting evidence from FOSS communities
- Formal, verifiable process or inconsistent practice?

e) Identify and document claims and use cases
Contributor

From the Safety Arch WG: OPEN - it is not clear whether any condition during this phase could lead us to iterate back to previous phases.

- Based on Eli Hibshoosh's slides

Signed-off-by: Paul Albertella <paul.albertella@codethink.co.uk>
- Based on review comments from Safety Architecture WG

Signed-off-by: Paul Albertella <paul.albertella@codethink.co.uk>
2 participants