Update Linux_Memory_Management_Essentials.md

Signed-off-by: Igor Stoppa <[email protected]>
elisa-tech · May 24, 2024 · cfe198a
1 parent f3ec001
Showing 1 changed file with 32 additions and 8 deletions.
diff --git a/Contributions/Linux_Memory_Management_Essentials.md b/Contributions/Linux_Memory_Management_Essentials.md
@@ -131,7 +131,9 @@ The following considerations are of a more deductive nature.
 #### **Assertions**
 The following section presents a set of statements that can be objectively verified e.g. by inspecting the sources.
 
-1. given a user-space logical memory page, at a certain moment, it must not be assumed to have a corresponding backing physical memory pages; it means that a process might have had some of its pages dropped and/or swapped out to disk, or perhaps not even ever loaded.
+1. given a user-space logical memory page, at a certain moment, it must not be assumed to have a corresponding
+   backing physical memory page; it means that a process might have had some of its pages dropped and/or
+   swapped out to disk, or perhaps never even loaded.
 2. it is not possible to make many assumptions about the state of the logical content of a process's pages.
    Perhaps one of the few statements that can be made is that the logical page containing code being executed
    at a certain moment also has a physical backing, while it is being executed.
@@ -150,9 +152,21 @@ The following section presents a set of statements that can be objectively verif
       1. constants
       2. the main executable associated with the process
       3. the dynamic linker (optional but typically used for ELF files)
-      4. linked libraries
-4. the kernel uses various optimisations for dealing with processes on-demand mapping:
-   1. a physical page is allocated and mapped to a process only when the process accesses it:
+      4. linked libraries (e.g. glibc)
+4. when the kernel starts a process, it sets up mappings for all the virtual memory areas required to get it started,
+   but it doesn't actually allocate any memory: once the process is scheduled for execution, it will eventually be run
+   for the first time, and as soon as the various areas are accessed, they will trigger page fault exceptions.
+      1. when an address is accessed for the first time, it might require that a memory page is allocated to host whatever
+         the associated content might be, but it is possible, especially when the process has just been started, that
+         even the associated page in the page table is missing.
+      2. based on both the availability of free pages and the type of associated content, the access might cause the process
+         to sleep; examples:
+         1. no free pages are available (unusual but possible) and the kernel will have to try to obtain one, in one of
+            the ways described above
+         2. the content needs to be loaded from a file, and the operation is blocking, because the retrieval process is much slower.
+      Either way, it is difficult to know if/when a process is not going to generate any more faults, and it is very much not deterministic.
+6. as also described in the previous point, the kernel uses various optimisations for dealing with processes on-demand mapping:
+   1. a physical page is allocated and mapped to a process only when the process accesses it, otherwise it might not be present:
       1. file backed pages are allocated/mapped only when read/written to
       2. anonymous pages are allocated only when written to
    2. zero page: when a page is known to be empty, it is not reserved and mapped; instead,
@@ -162,30 +176,40 @@ The following section presents a set of statements that can be objectively verif
       are mapped as read-only into each process address space
    4. copy-on-write:
       1. when a library has its own data, this is initially mapped as read-only and shared;
-         only when written to, then a separate physical page is reserved for eac hwriting process (threads sahre the same)
+         only when written to, then a separate physical page is reserved for each writing process (threads share the same)
       2. same happens for data pages that were treated as zero-page, but then are written to.
    5. folios (see also kernel section): data structures that try to better abstract compound pages and *might* also be used
       to represent optimised contiguous pages on ARM64 (instead of mapping 16 entries, it is possible
       to map a 16-pages chunk of physically contiguous memory, that is also aligned to a 16 pages
       boundary, reducing TLB use).
       1. A folio acts as intermediary between vma and lower level memory management
       2. it might pre-allocate/map more pages than explicitly requested
-5. read-ahead: when asked to fetch data from disk, the kernel might attempt to optimise the operation,
+7. read-ahead: when asked to fetch data from disk, the kernel might attempt to optimise the operation,
    reading more pages than requested, under the assumption that more requests might be coming soon
 
 
 #### **Safety-Oriented consideration**
 The following considerations are of a more deductive nature.
 
-1. a process that is supposed to support safety requirements should  not have pages swapped out / dropped,
+1. a process that is supposed to support safety requirements should not have pages swapped out / dropped / missing,
    because this would introduce:
    1. uncertainty in the timing required to recover the content, if not immediately available
    2. additional risk, involving the userspace paging mechanisms in the fulfilling of the safety requirements
    3. additional dependency on runtime linking, in case the process requires it, and code pages have been
       discarded - reloading them from disk will not be sufficient
-3. The optimisations made by the kernel in providing physical backing to process memory make it very
+2. The optimisations made by the kernel in providing physical backing to process memory make it very
    questionable whether it can be assessed when (a part of) a process's memory content is actually present in the
    system physical memory.
+3. by default, it is to be expected that a process will be exposed to various types of interference from the kernel:
+   1. some of a more benign nature, like dropping of pages, or non-allocation of not-yet-used ones
+   2. some limited in extent, but hard or even practically impossible to detect, like a rogue write to process physical memory
+   3. some of systemic nature, like some form of use-after-free, where a process page is accidentally in use also by another component
+   4. some of indirect nature, like for example when the page table of the process address space is somehow corrupted
+4. again, because of the extremely complex nature of the system, positive testing is not sufficient, but it needs to
+   be paired also with negative testing, proving that it is possible to cope with interference and detect it, somehow.
+5. the same considerations made about integrity vs. availability for the kernel are valid here too: detecting
+   interference doesn't help with keeping it under a certain threshold, and due to the complexity of the system,
+   it is not possible to estimate the risk reliably.
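
Illustrative note (not part of the commit): the on-demand allocation described in the assertions can be observed from user space. The following is a minimal sketch, assuming Linux and Python 3; the helper names `minor_faults`, `touch_pages` and `try_lock_all` are hypothetical and introduced only for this example. It counts the minor page faults reported by `getrusage` while writing to a freshly mapped anonymous region for the first time, and then attempts `mlockall`, one commonly used way to ask the kernel to keep a process's pages resident. The `MCL_*` values are the ones used on x86 and arm64 and are an assumption on other architectures.

```python
import ctypes
import mmap
import resource

PAGE = mmap.PAGESIZE

# Assumption: Linux values on x86 and arm64; a few architectures differ.
MCL_CURRENT = 1
MCL_FUTURE = 2


def minor_faults():
    """Minor page faults taken so far by this process, as reported by getrusage."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def touch_pages(npages):
    """Map npages of anonymous memory and write one byte into each page.

    Returns (pages_touched, minor_faults_observed). The fault count is not
    guaranteed to equal the page count: the kernel may map more than one
    page per fault, and other activity in the process also causes faults.
    """
    length = npages * PAGE
    buf = mmap.mmap(
        -1,
        length,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    try:
        before = minor_faults()
        touched = 0
        for offset in range(0, length, PAGE):
            buf[offset] = 1  # first write: physical backing is provided here
            touched += 1
        return touched, minor_faults() - before
    finally:
        buf.close()


def try_lock_all():
    """Ask the kernel to keep current and future pages resident.

    Returns True on success. Failure is common for unprivileged processes,
    because of RLIMIT_MEMLOCK or a missing CAP_IPC_LOCK capability.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    return libc.mlockall(MCL_CURRENT | MCL_FUTURE) == 0


if __name__ == "__main__":
    touched, faults = touch_pages(256)
    print(f"touched {touched} pages, observed {faults} minor faults")
    print("mlockall succeeded" if try_lock_all() else "mlockall failed")
```

Even when `mlockall` succeeds, it only addresses pages being swapped out, dropped or not yet present; it does nothing about the other forms of interference listed above, such as rogue writes or page table corruption.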