┏━━━━━━━━━━━━━┓
┃ CACHING ┃
┗━━━━━━━━━━━━━┛
CACHING ==> #Computing|fetching information once and reusing it, instead of redoing the work each time
#I.e. information reuse, worthwhile when temporal locality of reference is high
#Usually conceptually a hash table with function input as key
#Cache hit|miss:
# - whether a specific function invocation can use cache
# - hit ratio: percentage of cache hits
# - "trashing":
# - very low hit ratio, i.e. time spent updating cache > time saved by caching
# - is when locality of reference (how much data is reused) < cache max size
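# - average access time (illustrative arithmetic, assuming a 1ns cache in front of 100ns RAM):
# - avg = hit_ratio * t_cache + (1 - hit_ratio) * (t_cache + t_backing)
# - e.g. 90% hit ratio: 0.9 * 1ns + 0.1 * (1ns + 100ns) = 11ns on average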
#If cache has max size, it requires a replacement policy ("cache algorithm")
# - tradeoff between hit ratio (fewer cache misses) and latency (how long to process cache hits)
# - can be:
# - LRU:
# - remove Least Recently Used item
# - i.e. a queue ordered by recency (see the sketch after this section)
# - PLRU (Pseudo-LRU)
# - remove not recently used item (not necessarily least)
# - MRU:
# - remove Most Recently Used item
# - i.e. stack
# - useful when items are rarely used twice in a row
# - LFU: remove Least Frequently Used item
# - Bélády:
# - remove the item whose next use is furthest in the future
# - usually unknown, i.e. can only be approximated
# - RR (Random Replacement): remove random item
#"Write-back": cache also acting as a buffer on write requests
MEMOIZATION ==> #Caching function calls:
# - by storing a hash table inside that function with input as key, return value as value
# - return value must be predictable, i.e. the function must be pure (see sketch below)
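
#Illustrative memoization sketch (Python, not from this file); the stdlib
#functools.lru_cache decorator provides the same with a bounded size:

from functools import wraps

def memoize(func):
    cache = {}  # hash table kept with the function: input -> return value

    @wraps(func)
    def wrapper(*args):
        if args not in cache:  # miss: compute once and store
            cache[args] = func(*args)
        return cache[args]  # hit: reuse the stored value
    return wrapper

@memoize
def fib(n):  # pure, so its return value is predictable from its input
    return n if n < 2 else fib(n - 1) + fib(n - 2)
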
MEMORY HIERARCHY ==> #The higher, the faster:
# - CPU registers (0.25ns): 8-256 registers of a few bytes each
# - L1 cache (1ns): 128KB
# - L2 cache (4ns): 1MB
# - L3 cache (16ns): 6MB
# - Primary storage (RAM) (100ns): 256MB-64GB
# - Secondary storage (Disk) (3ms): 1GB-256TB
# - Tertiary storage (Remote or offline) (150ms): unlimited
#High spatial locality of reference helps because:
# - items are more likely to use a higher|faster cache
# - items are fetched in pages or "cache lines", i.e. other items close in memory are fetched too
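
#Illustrative spatial-locality sketch (Python, not from this file): row-by-row
#traversal touches neighboring memory and tends to beat column-by-column
#traversal, since neighbors arrive together in the same fetched page|cache line:

import time

n = 2000
matrix = [[0] * n for _ in range(n)]

start = time.perf_counter()
row_sum = sum(matrix[i][j] for i in range(n) for j in range(n))  # adjacent accesses
row_major = time.perf_counter() - start

start = time.perf_counter()
col_sum = sum(matrix[j][i] for i in range(n) for j in range(n))  # strided accesses
col_major = time.perf_counter() - start

print(f"row-major: {row_major:.3f}s, column-major: {col_major:.3f}s")
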
CPU CACHING ==> #"Multi-banked cache":
# - separate L1 instruction cache and L1 data cache
# - opposite: "unified"
#"Inclusive cache":
# - L2 cache includes L1, L3 includes L2
# - opposite: "exclusive"
# - can also be "Non-inclusive non-exclusive" (NINE), i.e. each level can be either
CACHE PROXY ==> #Can be (see the sketch after this list):
# - not proxied: data <-> client <-> cache
# - con: more work for client
# - proxied: client <-> cache <-> data
# - con: more coupling with cache
# - must have same data model
# - less resilient: the cache becomes a single point of failure
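
#Illustrative topology sketch (Python, not from this file); cache and store are
#assumed dict-like stand-ins:

def get_not_proxied(key, cache, store):
    # not proxied: the client itself checks the cache and falls back to data
    value = cache.get(key)
    if value is None:  # miss: the extra work falls on the client
        value = store[key]
        cache[key] = value
    return value

class CachingProxy:
    # proxied: the client only sees the proxy, which must mirror the data model
    def __init__(self, cache, store):
        self.cache, self.store = cache, store

    def __getitem__(self, key):
        value = self.cache.get(key)
        if value is None:  # miss is handled behind the proxy
            value = self.store[key]
            self.cache[key] = value
        return value
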
CACHE WRITE ==> #Client can write to (see the sketch after this list):
# - write-around: data
# - pro: smaller cache, i.e. good if low hit ratio
# - con: cache might be inconsistent, behind data
# - slower reads on miss
# - called "cache-aside" if not proxied, "read-through" if proxied
# - write-through: cache + data, waiting on both
# - pro: cache|data are consistent
# - con: slower writes
# - write-back|behind: cache + data, waiting only on cache
# - con: data might be inconsistent, behind cache
# - data loss if cache fails
# - con: slower writes, but less so than write-through (only the cache is awaited)
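
#Illustrative write-policy sketch (Python, not from this file); cache and store
#are assumed dict-like stand-ins, and the write-back flush is manual here where
#a real system would run it asynchronously:

class Writer:
    def __init__(self, cache, store):
        self.cache, self.store = cache, store
        self.dirty = set()  # keys written to cache but not yet to data

    def write_around(self, key, value):
        self.store[key] = value  # data only: any cached copy is now stale

    def write_through(self, key, value):
        self.cache[key] = value  # cache + data, waiting on both: consistent, slower
        self.store[key] = value

    def write_back(self, key, value):
        self.cache[key] = value  # cache only: fast, but lost if the cache fails
        self.dirty.add(key)

    def flush(self):
        for key in self.dirty:
            self.store[key] = self.cache[key]
        self.dirty.clear()
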
CACHE ATTACK ==> #Accessing cached confidential information
#I.e. authentication is effectively bypassed because the content was cached
#Prevention:
# - do not cache (see the sketch after this list)
# - including for HTML autocomplete (see its doc)
# - cache for shorter time
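
#Illustrative "do not cache" sketch (Python, not from this file; the handler,
#body, and port are assumptions): marking a confidential HTTP response as
#uncacheable so no shared cache can serve it to another client:

from http.server import BaseHTTPRequestHandler, HTTPServer

class NoCacheHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"confidential account page"
        self.send_response(200)
        # "no-store" forbids any cache from keeping a copy;
        # "private" additionally keeps shared (proxy) caches out
        self.send_header("Cache-Control", "no-store, private")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), NoCacheHandler).serve_forever()
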
CACHE POISONING ==> #Setting a malicious cache value so that other clients sharing the same cache use it