
Linux Kernel Safety First Principles #36

Open · wants to merge 7 commits into base: main

Conversation

igor-stoppa (Collaborator)

No description provided.

@dweingaertner (Collaborator) left a comment:

I'm not sure if you are still working on the text, but here are my suggestions after your last comments. Hope they help.

@igor-stoppa force-pushed the main branch 5 times, most recently from cfe198a to 40f0ea1 (May 24, 2024 15:18)
@igor-stoppa force-pushed the main branch 2 times, most recently from 8010ac0 to d8321aa (May 30, 2024 15:52)
@igor-stoppa changed the title from "Example of system decomposition" to "System decomposition + Memory Management Essentials + First Principles" (May 30, 2024)
@igor-stoppa (Collaborator Author):

@reiterative the PR title is updated to better reflect its content. As I also wrote in a mail, the First Principles statements need the Memory Management Essentials to be fully justified, so I grouped them together.
I wrote to Pete asking if he agrees to be mentioned as co-author, but I have not added him yet, until I get permission - and his email address.

@igor-stoppa (Collaborator Author):

Added Pete to the list of co-authors.

1. The vanilla Linux Kernel alone is not sufficient to support safety goals.
2. No internal kernel barriers/protections against interference to variable data, including safety-relevant data.
3. Any component within the kernel context can generate (cascaded) interference.
4. Internal interference can cascade also through components that have been qualified for safety at unit-level.
Contributor:

I would remove "at unit-level", as the concept of a unit usually applies to newly developed SW, not to qualifying pre-existing SW.

Collaborator Author:

"Unit" is meant to represent the granularity chosen for the qualification, whatever that might be.
I'm trying to say SEooC without using the terminology, but that is the intended meaning.
As long as the meaning is preserved, I'm open to re-formulating it.

Collaborator:

Suggested change:
- 4. Internal interference can cascade also through components that have been qualified for safety at unit-level.
+ 4. Internal interference can also cascade through components that have only been safety-qualified at a component level.

Collaborator Author:

@reiterative I'm almost ok with this, but I do not like that "only", because it seems to imply that something can be done on components to prevent the cascading of interference.
In theory, yes, you could add redundancy and whatnot, but in practice I don't think anyone will be able to make components sufficiently hardened.
If that is what you are after, then let's spell it out. I want to paint an accurate picture of the problem, without closing avenues that are open, but I do not want to present a rocky mountain path as if it were a boulevard either :-D

3. Any component within the kernel context can generate (cascaded) interference.
4. Internal interference can cascade also through components that have been qualified for safety at unit-level.
5. The Linux Kernel is able to interfere with both itself and any part of user-space processes (linear map).
6. No generic solution for very complex systematic corruption of any writable memory.
Contributor:

What do you mean here by "generic solution"? Do you mean "currently there is no design solution to protect writable safety-relevant memory"?

Collaborator Author:

"Currently" is implied but, again, until something is proven to work, it is an absolute statement.
It can be rephrased, if it helps with readability, without altering the meaning.
My point is that the "very complex systematic corruption" cannot be detected through solutions that are applicable to - let's say - the memory map of a safety-relevant process, which might comprise tens or hundreds of MB of memory. You can certainly try to use ad-hoc verification, e.g. performing functional testing on that process, but it needs to be tailored to each individual process. No generic solution. Only a generic methodology, to be tailored.

Collaborator:

I think we can make a stronger (but narrower) statement here:

Suggested change:
- 6. No generic solution for very complex systematic corruption of any writable memory.
+ 6. Complex systematic corruption of writable memory cannot be prevented by a generic, kernel-level solution

Collaborator Author:

@reiterative: "neither prevented nor detected" is the right formulation, I think.

Contributor:

@igor-stoppa I think your statement is not correct. For example, CONFIG_STACKPROTECTOR_STRONG or KASAN are generic kernel solutions that provide detection of writable-memory corruption. So I would stick with Paul's rewording.

Collaborator Author:

@paolonig They are partial solutions. Placing canaries on the stack doesn't catch all the possible corruptions.
KASAN alters the system configuration and, anyway, is not deployed in production.
Feel free to provide evidence countering my arguments; they are intentionally made in a very clean-cut way, to allow easy refutation by evidence, if any can be found.
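To make the "partial" argument concrete, here is a minimal user-space sketch (hypothetical and illustrative only; a real stack protector is inserted by the compiler, not hand-written):

```c
/* Minimal sketch (hypothetical): why canaries are only a partial measure.
 * A stack protector places a canary between local buffers and the data it
 * guards, and checks the canary on function exit. */

#include <stdint.h>
#include <stdio.h>

struct frame_model {            /* simplified model of a protected frame */
    char     buf[16];           /* local buffer                          */
    uint64_t canary;            /* canary guarding what follows it       */
    uint64_t saved_retaddr;     /* what the canary is meant to protect   */
    uint64_t other_local;       /* data *not* covered by the canary      */
};

int main(void)
{
    struct frame_model f = { .canary = 0xdeadbeefcafef00dULL };
    uint64_t expected = f.canary;

    /* A linear overflow of buf (e.g. memset(f.buf, 'A', 17)) would clobber
     * the canary's first byte and be caught by the epilogue check.
     *
     * A non-linear corruption - a stray pointer write landing on another
     * local - skips the canary entirely and is never noticed: */
    f.other_local = 0x4141414141414141ULL;

    printf("canary %s\n", f.canary == expected
           ? "intact: the corruption went undetected"
           : "tripped: the corruption was caught");
    return 0;
}
```

The canary guards one specific layout invariant; any corruption that does not cross it is invisible to the check.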

Contributor:

Yes, they are partial generic kernel-level solutions, I agree. However, an integrator or a SW vendor may use such partial solutions in combination with other measures to support a non-interference claim.
So IMO the fact is that the monolithic nature of the Kernel does not allow, today, interference prevention (like what an SMMU can provide for a user-space process), and we can all agree on this.
What I cannot agree with is principles that also imply or deny specific ways to support a certain claim, since the specific ways depend on the specification of the integration context and on other project aspects (depending on the engineering workforce, I may decide to use some verification strategies rather than others...).

9. Those security features which are based on randomisation decrease repeatability of testing (e.g. structure layout randomisation).
10. Safety claims must be supported by components with the same-or-better safety level (e.g. a safety-qualified watchdog).
11. Non safety qualified processes can interfere with safety qualified ones, indirectly, through the kernel (e.g. triggering memory management bugs).
12. Using cgroups/containers/SELinux removes only certain simple types of user-space-induced interference.
Contributor:

I would remove "simple", as it is too debatable IMO.

Collaborator Author:

simple -> direct ?

This section provides touchstone concepts about the (lack of) system availability in the presence of non-safety-qualified components.

1. Detecting interference alone doesn't help with controlling/managing availability.
2. The nature of the very complex systematic interference makes estimating probability of a failure unrealistic.
Contributor:

I disagree on this point. For instance, even if an external safety monitor is required to meet the FDTI claim, it is still possible to tune the system and stress-test it to claim that the unavailability risk due to temporal interference is acceptable.

Collaborator Author:

Feel free to estimate in a verifiable way said probability, then.
We can review the soundness of the estimation.

Contributor:

You can do that only if you define what good-enough RT performance is. BTW, you can start from this paper and the toolset mentioned within it as a good starting point.

Collaborator Author:

I'm not talking about RT performance, at least not on the time scale commonly intended.
I'm talking about minutes (or more) of overhead on basic operations.

Contributor:

Can you give an example of a basic operation, and also explain what is preventing it from running?

Collaborator Author:

Imagine a set of processes implementing machine vision using tens of GB of RAM.
Try to monitor that in real time for corruption caused by the kernel ...
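A back-of-the-envelope sketch of why this is impractical (hypothetical numbers; actual throughput depends on the machine): even a trivial integrity pass over such a working set takes seconds per iteration, while the data keeps being legitimately modified underneath the scanner.

```c
/* Hypothetical sketch: cost of naively scanning a large working set for
 * corruption. Scan 1 GiB, time it, and extrapolate to a 40 GiB
 * machine-vision working set. Note: freshly calloc'd memory is typically
 * zero-page backed, so a scan over real resident data would be slower. */

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define CHUNK (1UL << 30)   /* 1 GiB sample */

int main(void)
{
    uint64_t *buf = calloc(1, CHUNK);
    if (!buf)
        return 1;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    volatile uint64_t sum = 0;   /* stand-in for a real checksum */
    for (size_t i = 0; i < CHUNK / sizeof(*buf); i++)
        sum += buf[i];

    clock_gettime(CLOCK_MONOTONIC, &t1);
    double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;

    printf("1 GiB scanned in %.3f s -> ~%.1f s per pass over 40 GiB\n",
           s, s * 40);
    free(buf);
    return 0;
}
```

And a checksum pass can only flag corruption of data that is supposed to be static; the bulk of such a working set is legitimately changing while it is scanned.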


1. Detecting interference alone doesn't help with controlling/managing availability.
2. The nature of the very complex systematic interference makes estimating probability of a failure unrealistic.
3. Stress testing as a means of safety qualification might be realistic at most for very simple cases.
Contributor:

But here the topic is availability, not safety qualification...

Collaborator Author:

Safety requirements cover both FFI and availability, for the higher ASILs. No?

Contributor:

Most of the use cases I worked on did not require the OS to be safe-operational. I.e. panicking the OS would put the system in an acceptable safe state (with a very upset driver, hence the need for availability, which however is not related to safety).

Collaborator Author:

Ok, then just keep the system off :-)

Contributor:

Jokes apart, I think that in order to agree on such principles we first need to define the integration context.
As I mentioned, all the customers I worked with usually accept having the availability test campaign separate from safety (since safety is guaranteed by the external monitor in case the FDTI deadline is missed).

Collaborator Author:

These principles are not meant to be democratically decided by what any customer might think.
I understand the "the customer is always right" approach, but that doesn't belong in this document.

Anyone is free to take the risks they prefer, small or large.

This document is not about that. It's about separating assessments that can be done with very high confidence, if not certainty, from others that are based on subjective analysis.

For example, can any of your customers prove that they have determined the coverage of that stress testing?

Performing the same or similar operations billions of times doesn't say anything about other scenarios that might arise.
It can certainly be done in simple scenarios, but in more complex ones it cannot.
Again, I'm trying to make it easy to prove me wrong, by making fairly clean-cut statements.

Just add evidence ...

5. evaluation of Linux Security Module hooks
6. evaluation of cgroups tests
7. presence of non-safety-relevant components triggering many of the above points

Contributor:

The above points are correct, and this is why a QM baseline is always required to support availability, but the same baseline applies to NSR components... so what is the goal here? Wouldn't it be easier to say "a QMS baseline is required to meet the product's availability requirements"?

Collaborator Author:

The point is that a QM component is by definition less safe than an ASIL one.
Even in the presence of perfect detection of QM->ASIL interference, the more QM components can interfere with the ASIL one, the lower the MTBF.
A QM baseline might not be enough to meet availability requirements.
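A minimal illustrative model of this argument (my notation, not part of the PR text): if each of n QM components can independently force a safe-state entry at rate lambda_i, and every detected interference counts as a failure of the intended functionality, then

```latex
% Each detected interference forces a safe-state entry, i.e. a failure
% of the intended functionality:
\lambda_{\mathrm{safe\,state}} \;=\; \sum_{i=1}^{n} \lambda_i
\qquad\Longrightarrow\qquad
\mathrm{MTBF} \;\approx\; \frac{1}{\sum_{i=1}^{n} \lambda_i}
```

Every additional QM component adds a term to the sum, so the MTBF of the intended functionality can only decrease, even with perfect detection.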

Contributor:

Sorry, but you are wrong here. A detected interference is not a failure of the system (for fail-safe systems), and hence it does not contribute to the final MTBF target. For safe-operational systems the story is different, as detecting the interference is not a good-enough mechanism (but then you need to disambiguate this clearly, as availability is usually considered separate from safety).

Collaborator Author:

A detected interference tells you that an interference happened, but it doesn't tell you that the system is still safe.
It tells you that something is generating interference. And unless you have an ad-hoc way of doing more than that, the only solution (in a non-redundant system) is to put it into safe mode. That constitutes, in practice, a functional failure, and it lowers the MTBF.

Collaborator Author:

I'm happy to reword it, for improved clarity, but the meaning is what I wrote above.

Contributor:

As above, it all depends on your system, on your integration context.
If for the integrator it is safety-critical to avoid interference, since detection alone would still lead to a hazard, then you're right.
If instead detecting interference triggers the safe state, which relies on a backup system - for example, pulling over the car - then you're not right.
So I think we should better clarify the assumptions on the integration context...

Collaborator Author:

The backup system is out of the scope of this document.
You can very well consider it as confirmation of my point.
This is talking about availability of the intended functionality of the "product" - but let's agree that it is a car.
A pull-over is a failure. It's not a safety failure, certainly. But it IS a failure of the product.
We can add a clarification about what MTBF means in this context (what I just wrote), if it can help.

Considerations based on the two previous sections.

1. Devise negative testing simulating interference toward safety-relevant components.
2. Rely only on external components/mechanisms that are already qualified at a same-or-better safety level.
Contributor:

Rather than "external" I would say "independent".

Collaborator Author:

Can you name an independent internal component?

Contributor:

The combination of a Runtime Verification Monitor and an external WDT can be used as an independent monitor, for example.
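A sketch of that combination (hypothetical and heavily simplified; the kernel's RV framework expresses monitors as automata fed from tracepoints, and pet_external_watchdog() stands in for the driver of a safety-qualified external watchdog):

```c
/* Hypothetical sketch of the RVM + external watchdog idea: a tiny
 * deterministic automaton checks that observed events follow the
 * expected model, and the external watchdog is petted only while the
 * model holds. If the monitored code misbehaves or hangs, the petting
 * stops and the watchdog expires. */

#include <stdbool.h>

enum state { IDLE, RUNNING };         /* states of the model       */
enum event { EV_START, EV_STOP };     /* events fed by tracepoints */

static enum state cur = IDLE;

/* Advance the automaton; return false on any out-of-model event. */
static bool rv_step(enum event ev)
{
    switch (cur) {
    case IDLE:
        if (ev == EV_START) { cur = RUNNING; return true; }
        return false;
    case RUNNING:
        if (ev == EV_STOP)  { cur = IDLE;    return true; }
        return false;
    }
    return false;
}

/* Called on every monitored event. pet_external_watchdog() is a
 * placeholder for the driver of the external, safety-qualified WDT. */
void rv_handle_event(enum event ev)
{
    extern void pet_external_watchdog(void);

    if (rv_step(ev))
        pet_external_watchdog();
    /* else: stop petting; the external WDT forces the safe state. */
}
```

If the monitored events stop arriving, or leave the model, the watchdog is no longer petted and the external device forces the safe state on its own.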

Collaborator Author:

The external WDT is not internal :-D

Collaborator Author:

Anyway, the RVM out of the box doesn't provide that functionality; you have to feed it the models.

Contributor:

Yes, of course, you need to design your specific RVM. It was just an example to justify "independent" as a better word compared to "external"...

Collaborator Author:

Ehh. Perhaps it's splitting hairs, but it is not independent. It uses library code from the kernel itself.


1. Devise negative testing simulating interference toward safety-relevant components.
2. Rely only on external components/mechanisms that are already qualified at a same-or-better safety level.
3. Analyse effects of interference based on recipient (e.g. corrupted data harder to notice)
Contributor:

Here I would say "Make sure that the interference modes are avoided or detected, either at the source or at the recipient side, with a sufficient level of rigor".

Collaborator Author (@igor-stoppa, Jun 19, 2024):

No, it's intentionally excluding the source, because the whole point is that it's impossible to claim completeness about sources.

Again, if you disagree, feel free to prove me wrong and enumerate all the possible sources.

Contributor:

CONFIG_STACKPROTECTOR_STRONG is a mechanism that can detect interference both at the source (the stack frame of the function where the interference is generated) and at the recipient (the stack frame of a function where the interference crosses the canaries).
KASAN is a mechanism that detects interference at the source.

Collaborator Author:

The stack protector adds a canary in certain places. Even adding canaries to all the similar places, interference that doesn't kill a canary doesn't get noticed.

Again, feel free to prove me wrong (maybe I should find an acronym for this, considering how many times I've written it).

Wrt KASAN, just to give an example: it can detect out-of-bounds errors, but if there is an error that doesn't systematically cross boundaries, it will not necessarily be detected.
That is just an easy example, off the top of my head ...
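For instance, a (hypothetical) corrupting write that stays fully inside the bounds of another live allocation touches no redzone and no canary:

```c
/* Hypothetical sketch: KASAN places redzones *around* allocations, so an
 * out-of-bounds access into a redzone is caught. A corrupting write that
 * lands *inside* another valid allocation never touches a redzone and is
 * not flagged. */

#include <stdlib.h>

struct config  { long safety_limit; };   /* safety-relevant data  */
struct scratch { char buf[64]; };        /* unrelated object      */

void corrupt(void)
{
    struct config  *cfg = malloc(sizeof(*cfg));
    struct scratch *tmp = malloc(sizeof(*tmp));
    if (!cfg || !tmp)
        return;

    cfg->safety_limit = 100;

    /* Bug: a stale or miscomputed pointer happens to alias cfg. The
     * write is fully in-bounds of a live allocation, so neither KASAN
     * redzones nor stack canaries have anything to trip on. */
    long *wild = (long *)cfg;   /* stands in for a corrupted pointer */
    *wild = -1;                 /* safety_limit silently ruined      */

    free(tmp);
    free(cfg);
}
```

From the allocator's point of view, this write is indistinguishable from a legitimate update to cfg->safety_limit.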

3. Analyse effects of interference based on recipient (e.g. corrupted data harder to notice)
4. When possible, leverage system configuration (e.g. external safety island / external monitor)
5. Pick adequate granularity for verification (e.g. fine grained component validation vs end-to-end)
6. In case of end-to-end verification, choose strategy: periodic test injection, or on demand.
Contributor:

IMO point 6 is about "Make sure that mechanisms used to prevent or detect interference are properly verified. In the case of HW mechanisms, periodic or on-demand diagnostic tests may be required".

Collaborator Author:

This is not about HW - not necessarily.

It's about using QM (or lower) SW while performing E2E verification that it's still functioning correctly.

Contributor:

But then, rather than e2e verification, it seems an actual ASIL decomposition (and in this case we need to make sure that an appropriate safety analysis is done on the QM element, to make sure the ASIL monitors cover all the dangerous failure modes)...

Collaborator Author:

If one end can make assumptions about the other end, and validate them for correctness, isn't it e2e?
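A sketch of what that could look like (hypothetical message format, loosely modelled on the common E2E-protection pattern of a checksum plus a sequence counter; checksum() here is a toy stand-in for a real CRC):

```c
/* Hypothetical sketch of E2E verification across a QM transport: the
 * producer seals each message with a sequence number and a checksum; the
 * safety-relevant consumer validates the assumptions it makes about the
 * other end. Corruption or stalls introduced by the QM software in
 * between are detected at the receiving end. */

#include <stdbool.h>
#include <stdint.h>

struct e2e_msg {
    uint32_t seq;        /* detects lost, stuck or replayed messages */
    uint32_t payload;    /* the actual data                          */
    uint32_t check;      /* detects corruption in transit            */
};

/* Toy checksum over the protected fields; a real design uses a CRC. */
static uint32_t checksum(const struct e2e_msg *m)
{
    return (m->seq * 2654435761u) ^ (m->payload * 40503u);
}

void e2e_send(struct e2e_msg *m, uint32_t payload, uint32_t seq)
{
    m->seq     = seq;
    m->payload = payload;
    m->check   = checksum(m);
}

/* Consumer side: validate the assumptions made about the other end. */
bool e2e_receive(const struct e2e_msg *m, uint32_t expected_seq)
{
    if (m->check != checksum(m))
        return false;            /* corrupted in the QM middle    */
    if (m->seq != expected_seq)
        return false;            /* stalled/dropped/duplicated    */
    return true;
}
```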

@dweingaertner (Collaborator) left a comment:

One of the main issues being discussed is the ability the kernel has to change any part of memory. Does someone know how such a problem is dealt with in other OSes that have high SIL certifications? For example, when using a hypervisor? Is the argumentation based on the simplicity of the hypervisor's code and expert knowledge evaluating it?

@igor-stoppa (Collaborator Author):

I can't reply directly to @dweingaertner's question, but here it is:

> One of the main issues being discussed is the ability the kernel has to change any part of memory. Does someone know how such a problem is dealt with in other OSes that have high SIL certifications? For example, when using a hypervisor? Is the argumentation based on the simplicity of the hypervisor's code and expert knowledge evaluating it?

They are usually designed from the ground up not to be so exposed to this problem.
Linux is monolithic; the others are microkernels. In a microkernel, a typical device driver is not allowed write access to vital kernel data. But in Linux it is, and that is one problem. With in-kernel barriers, it becomes easier to qualify only the relatively small portion of code that has high privilege.
And usually the development follows different processes, and there is more control over what gets accepted, and why, and how.

Xen and Zephyr are perhaps the best examples of FOSS that is developed in a safety-friendly way.

Regarding the hypervisor:
The hypervisor knows about an OS as much as the OS knows about one of the processes it runs.
Very little. And intentionally so.

Furthermore, notice that for a hypervisor to realistically poke its nose into the kernel and help with integrity, either the kernel acts cooperatively, abstracting its own data structures through specifically crafted hypervisor calls, or the hypervisor needs to have knowledge of the kernel internals, using its headers and probably libraries.
Many want to have a closed-source hypervisor, and that would not be possible, given the contagious nature of the GPLv2 used by the kernel.

These are just a few reasons, but they should already prove sufficient :-)
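To illustrate the "cooperative kernel" option mentioned above, a hypothetical paravirtual interface (all names invented; no such API exists in Linux or in any hypervisor I know of) might look like:

```c
/* Purely illustrative, invented interface: the guest kernel declares
 * which of its own regions must stay immutable, so the hypervisor can
 * write-protect them in the stage-2 page tables and trap any write,
 * without needing knowledge of the kernel's internals. */

#include <stdint.h>

#define HC_PROTECT_RANGE 0x1001UL   /* invented hypercall numbers */
#define HC_VERIFY_RANGE  0x1002UL

/* Stub for illustration; a real guest would trap into the hypervisor
 * via a dedicated instruction (e.g. vmcall/hvc). */
static long hypercall(unsigned long nr, unsigned long a0, unsigned long a1)
{
    (void)nr; (void)a0; (void)a1;
    return 0;
}

/* After late init, seal a region that must never change again. */
static long seal_region(const void *start, unsigned long size)
{
    return hypercall(HC_PROTECT_RANGE, (unsigned long)(uintptr_t)start, size);
}

/* Ask the hypervisor to re-check the region against the hash captured at
 * seal time - a check the (possibly corrupted) kernel cannot forge,
 * because it runs at a higher privilege level. */
static long verify_region(const void *start, unsigned long size)
{
    return hypercall(HC_VERIFY_RANGE, (unsigned long)(uintptr_t)start, size);
}
```

Note that this only works because the kernel cooperates before it can be corrupted; it does not require the hypervisor to parse kernel internals, which sidesteps both the knowledge problem and the GPLv2 issue mentioned above.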

igor-stoppa and others added 7 commits July 12, 2024 21:39
Co-authored-by: Eli Gurvitz <[email protected]>
Co-authored-by: Luigi Pellecchia <[email protected]>
Co-authored-by: Michal Szczepankiewicz <[email protected]>
Co-authored-by: Paul Albertella <[email protected]>
Co-authored-by: Peter Brink <[email protected]>
Co-authored-by: Sanjay Trivedi <[email protected]>
Signed-off-by: Igor Stoppa <[email protected]>
Co-authored-by: Paul Albertella <[email protected]>
Signed-off-by: Igor Stoppa <[email protected]>
Co-authored-by: Paul Albertella <[email protected]>
Signed-off-by: Igor Stoppa <[email protected]>
Co-authored-by: Paul Albertella <[email protected]>
Signed-off-by: Igor Stoppa <[email protected]>
Co-authored-by: Paul Albertella <[email protected]>
Signed-off-by: Igor Stoppa <[email protected]>
Co-authored-by: Paul Albertella <[email protected]>
Signed-off-by: Igor Stoppa <[email protected]>
Co-authored-by: Paul Albertella <[email protected]>
Signed-off-by: Igor Stoppa <[email protected]>
@igor-stoppa igor-stoppa changed the title System decomposition + Memory Management Essentials + First Principles Linux Kernel Safety First Principles Jul 12, 2024