┏━━━━━━━━━━━━━┓
┃ SORTING ┃
┗━━━━━━━━━━━━━┛
TODO: go through https://en.wikipedia.org/wiki/Category:Sorting_algorithms
┌───────────────┐
│ QUALITIES │
└───────────────┘
ADAPTIVE SORTS ==> #Sorting algorithms that perform better when input is already partially sorted
#("presorted")
STABLE SORTS ==> #Whether when two values are equal (according to their sort key), their order in the
#sorted array remains the same as in the initial array.
#Useful:
# - when initial order matters
# - including multi-stage sorting, i.e. when previous stage's order matters
EXTERNAL SORTS ==> #Sorting algorithms using a low amount of memory, i.e. incremental.
#Meant for when input is bigger than machine's memory.
PARTIAL SORT ==> #Sorting only k highest|lowest values
#Opposite: "full sort"
#Average time complexity: O(n log k)
#A generic solution is to partition, then sort only a specific subarray.
APPROXIMATE SORT ==> #Sorting so that each value is at most k indices ("radius") away from its actual sorted index.
#Result is "k-sorted sequence"
# - also named "nearly|roughly-sorted sequence"
#Pros:
# - faster time complexity
#Cons:
# - imperfect sort
┌─────────────┐
│ GENERAL │
└─────────────┘
COMPARISON SORTS ==> #Sorting algorithms using a FUNC(VAL, VAL2)-> <= or >=
# - often use -1, 0 and 1
#Average time complexity at least O(n log n)
COLLATION ==> #Defining a total order of a set of values.
#Often for numbers and characters.
#Allows using comparison sorts.
DISTRIBUTION SORTS ==> #Sorting algorithms using the value distribution instead of a comparison FUNC.
#Average time complexity at least O(n)
TAG ARRAY ==> #Instead of sorting an array of values, sort an array of their indices.
#When need to access the value (for comparison purpose), use its index in the tag array.
#Once tag array sorted, replace each index by the original value.
#Whole sort is sometimes called "tag sort".
#Pros:
# - sort logic moves indices instead of values, i.e. faster when values are large
# - keep original array in its original order
#Cons:
# - O(n) additional space
# - slower to access values due to indirection, i.e. slower when accesses are expensive compared to moves
┌──────────────────┐
│ BUCKET SORTS │
└──────────────────┘
BUCKET SORT ==> #Also called "bin sort"
#Type of distribution sort
#Steps:
# - "partition":
# - create k buckets
# - time complexity O(k)
# - space complexity O(k)
# - "scatter|disperse":
# - assign each value to a bucket|bin|category
# - each value must have a key used to find bucket in O(1)
# - e.g. first bucket for values < 100, second bucket for values 100 < x < 200, etc.
# - time complexity O(n)
# - space complexity O(n)
# - sort each bucket's values
# - using any sort algorithm
# - can also apply other sort algorithm on all buckets at once
# - if it is more performant for that algorithm
# - e.g. insertion sort which is fast when values do not move much
# - "gather|collect": concatenate all bucket's values
# - time complexity O(k)
#Most time complexity is from sorting each bucket's values
# - i.e. better if buckets have few values
# - i.e. either:
# - evenly distributed
# - choosing different bucket sizes to match the distribution of the values,
# if the distribution is known
#Pros:
# - best time complexity: O(n + k)
# - average time complexity: O(n + k + n*n/k)
# - stable
# - parallelizable
# - online, provided distribution of values is known in advance
#Cons:
# - worst time complexity: O(n**2 + k)
# - if not evenly distributed, e.g. all values in single bucket
# - i.e. requires being able to distribute evenly with a low amount of buckets
# - or matching the distribution of the values
# - not in-place, space complexity: O(n + k)
# - not adaptive
GENERIC BUCKET SORT ==> #Bucket sort with:
# - even buckets
# - ranging from min|max of the keys
#Example:
# - [52, 50, 70, 20, 10]
# -> [[], [], [], [], [], [], [], [], [], []] (partition)
# -> [[], [10], [20], [], [], [52, 50], [], [70], [], []] (scatter)
# -> [[], [10], [20], [], [], [50, 52], [], [70], [], []] (second sort)
# -> [10, 20, 50, 52, 70] (gather)
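#Sketch (TypeScript): a minimal generic bucket sort, assuming numeric values and even bucket
#ranges; names are illustrative, and the built-in sort stands in for the per-bucket sort:
#  const bucketSort = (values: number[], bucketCount: number): number[] => {
#    const min = Math.min(...values)
#    const span = (Math.max(...values) - min + 1) / bucketCount
#    // partition: create k empty buckets
#    const buckets: number[][] = Array.from({ length: bucketCount }, () => [])
#    // scatter: O(1) key -> bucket mapping
#    for (const value of values) buckets[Math.floor((value - min) / span)].push(value)
#    // sort each bucket, then gather
#    return buckets.flatMap(bucket => bucket.sort((a, b) => a - b))
#  }
#  bucketSort([52, 50, 70, 20, 10], 10) // [10, 20, 50, 52, 70]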
PIGEONHOLE SORT ==> #Bucket sort where values inside each bucket are sort-equivalent.
#Done by creating as many buckets as possible
# - e.g. when sorting integers, each bucket is exactly one integer
#Pros|cons like bucket sort except:
# - pro:
# - average|worst time complexity O(n + k): because no second sort step
# - cons:
# - high space complexity due to empty buckets
# - higher time complexity when partitioning and gathering, since number of buckets is higher
# - i.e. best when number of buckets is close to number of values
#Example:
# - [c, f, b, b, e]
# -> [[], [], [], [], [], []] (partition)
# -> [[], [b, b], [c], [], [e], [f]] (scatter)
# -> [b, b, c, e, f] (gather)
COUNTING SORT ==> #Also called "histogram sort"
#Same as pigeonhole sort except each bucket contains only the number of occurrences,
#not the values themselves.
# - i.e. the final array is produced by accumulating each bucket's offset,
# not by gathering the values from each bucket
#Similar pros|cons as pigeonhole sort but:
# - pros:
# - low space complexity: by using a count of values instead of repeated values
# - faster time complexity: does not require moving values
# - cons:
# - pros only apply when number of buckets is much lower than number of values
#Example:
# - [c, f, b, b, e]
# a b c d e f
# -> [0, 0, 0, 0, 0, 0] (partition)
# -> [0, 2, 1, 0, 1, 1] (sums)
# -> [0, 0-1, 2, 3, 3, 4] (offsets)
# -> [2, 4, 0-1, 0-1, 3] (indices)
# -> [2, 4, 0, 1, 3]
# -> [b, b, c, e, f]
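#Sketch (TypeScript): a minimal counting sort, assuming integer keys in [0, maxKey]:
#  const countingSort = (values: number[], maxKey: number): number[] => {
#    const counts = new Array(maxKey + 1).fill(0)
#    for (const value of values) counts[value]++ // sums
#    let offset = 0
#    for (let key = 0; key <= maxKey; key++) { // turn sums into offsets
#      const count = counts[key]
#      counts[key] = offset
#      offset += count
#    }
#    const output = new Array(values.length)
#    for (const value of values) output[counts[value]++] = value // stable placement
#    return output
#  }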
PROXMAP SORT ==> #Variant of bucket sort
#When creating buckets, their number and size depend on values:
# - a FUNC(value)->INT reduces each value to an approximate overall position ("map key")
# - e.g. Math.floor(value / 10) * 10, as in the example below
# - 1 bucket === 1 map key
# - must be chosen carefully:
# - both too few buckets and too many empty ones lead to higher time complexity
# - i.e. must try to approach 1 bucket === 1 value
# - create a cumulative histogram ("proxmap") like counting sort does
# - time complexity: O(n)
#When scattering values, and bucket > 1 value, use insertion sort.
#Similar pros|cons as bucket sort but:
# - pros:
# - higher chance to distribute evenly
# - more space efficient, by lowering chances of empty buckets
# - allows proxmap search
# - cons:
# - additional cost of partitioning
# - finding the right map key FUNC depends on knowing the values distribution
#Example:
# - [12, 6, 56, 24, 22, 58] 0 10 20 30 40 50
# -> [10, 0, 50, 20, 20, 50] (map) -> [1, 1, 2, 0, 0, 2] (hit counts)
# -> [0, 1, 2, 4, 4, 4] (proxmap)
# -> [1, 0, 4, 2, 2, 4] (location)
# -> [6, 12, [24, 22], [56, 58]] (scatter)
# -> [6, 12, [22, 24], [56, 58]] (insertion sort)
# -> [6, 12, 22, 24, 56, 58] (gather)
PROXMAP SEARCH ==> #Fast searching based on proxmap sort:
# - apply the map key FUNC on a searched value
# - find start index using proxmap
# - if bucket had 1 value, correct index. Otherwise, do a linear|binary search forward
#Time complexity:
# - best|average: O(1)
# - worst: O(n), if not evenly distributed, e.g. all values in single bucket
#Example, re-using above example:
# - search 24
# - map key: 2
# - location: 2
# - search from index 2: 22 (not found), then 24 (found)
┌─────────────────┐
│ RADIX SORTS │
└─────────────────┘
RADIX SORT ==> #Also called "digital sort"
#Type of distribution sort
#Sort symbol by symbol using another sorting algorithm, when values can be broken down into sequences of w symbols
# - where each symbol's order is more important than the next one's, i.e. lexicographic order
# - e.g. numbers|strings
#Missing symbols must be filled with padding symbols
# - e.g. '0' for numbers or space for strings
# - with lower value than any other symbol
# - either on the left|highest or right|lowest symbol
#Can use any other sorting algorithm for the symbols.
# - often used: counting sort, since it works well with digits|characters
#Pros:
# - best|average time complexity: O(n * w)
# - space complexity: O(n + w)
# - low thanks to using a subset of the whole value in each step
# - i.e. works with large range of values
# - can be stable
#Cons:
# - worst time complexity: O(n**2 * w)
# - when w too low, i.e. does not break down into enough symbols
# - i.e. best when values are sequences of symbols
# - not in-place
# - not online
# - not adaptive
LSD RADIX SORT ==> #Also called "bottom-up radix sort"
#Radix sort going from lowest to highest symbol.
#Each new symbol:
# - must iterate over all values
# - must be stable, to keep sorting of next symbols
#Padding usually on the left, so it can happen symbol by symbol
#Instead of padding, can also group by length, then sort each group
#Pros|cons like radix sort, with also:
# - cons:
# - not parallelizable
# - due to padding, best for numbers
#Example:
# - 7 71 12 1
# -> [71 1] [12] [7] (last digit)
# -> 71 1 12 7
# -> [01 07] [12] [71] (second last digit)
# -> 1 7 12 71
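#Sketch (TypeScript): a minimal LSD radix sort for non-negative integers, base 10, with a
#stable bucket scatter per digit (no explicit padding needed for numbers):
#  const lsdRadixSort = (values: number[]): number[] => {
#    let sorted = [...values]
#    for (let digit = 1; digit <= Math.max(...values); digit *= 10) {
#      const buckets: number[][] = Array.from({ length: 10 }, () => [])
#      for (const value of sorted) buckets[Math.floor(value / digit) % 10].push(value)
#      sorted = buckets.flat() // gather, preserving bucket order (stable)
#    }
#    return sorted
#  }
#  lsdRadixSort([7, 71, 12, 1]) // [1, 7, 12, 71]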
MSD RADIX SORT ==> #Also called "top-down radix sort" or "postman's sort"
#Radix sort going from highest to lowest symbol.
#Each new symbol:
# - must iterate over values grouped by previous symbols
# - does not need to be stable
#Padding usually on the right, so it can happen symbol by symbol
#Pros|cons like radix sort, with also:
# - pros:
# - parallelizable
# - cons:
# - due to padding, best for strings
#Example:
# - g ga ab abc a
# -> [ab abc a] [g ga] (first character)
# -> [[a.] [ab abc]] [[g.] [ga]] (second character)
# -> [[[a..]] [[ab.] [abc]]] [[[g..]] [[ga.]]] (third character)
# -> a ab abc g ga
AMERICAN FLAG SORT ==> #Also called "in-place MSD radix sort"
#Variant of MSD that is in-place:
# - compute count of each value for highest symbol
# - compute indices of group for each value, using each count as group size
# - swap values to put them inside right group
# - recurse on each group with next symbol
#Pros|cons like MSD radix sort, with also:
# - pros:
# - in-place
# - more efficient when bucket size is power of 2, because can implement with bitwise logic
# - cons:
# - not stable
#Example:
# - g ga ab abc a
# -> a:3 g:2 (first character)
# -> [ab abc a] [g ga] (in-place swap)
# -> recurse
┌─────────────────────┐
│ INSERTION SORTS │
└─────────────────────┘
INSERTION SORT ==> #Type of comparison sort
#Insert each value, one at a time, to its sorted position in output array
# - i.e. to its first position where next value is bigger
#Pros:
# - in-place, space complexity O(1): by using growing sub-array at beginning of array
# - online
# - adaptive:
# - best time complexity O(kn) when each element is ~k places away from sorted place
# - i.e. O(n) if already sorted
# - stable
# - high locality of reference
# - i.e. fast when n is low
# - simple to implement
#Cons:
# - average|worst time complexity O(n**2): each newly inserted value must move, on average, half of the other values
# - not parallelizable
#Example:
# - 4 2 1 3 5
# -> 2 4 1 3 5
# -> 1 2 4 3 5
# -> 1 2 3 4 5
# -> 1 2 3 4 5
SIMPLE INSERTION SORT ==> #Insertion sort where each insertion does a series of swaps until finding its position
#Pro: very high locality of reference
#Con: does many writes
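#Sketch (TypeScript): a minimal in-place simple insertion sort:
#  const insertionSort = (values: number[]): number[] => {
#    for (let i = 1; i < values.length; i++) {
#      // swap the new value left until the previous value is not bigger
#      for (let j = i; j > 0 && values[j - 1] > values[j]; j--) {
#        ;[values[j - 1], values[j]] = [values[j], values[j - 1]]
#      }
#    }
#    return values
#  }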
LINEAR INSERTION SORT ==> #Insertion sort where each insertion uses a linear search, then a single swap.
#Same pros|cons as binary insertion sort, except better for very small array sizes
BINARY INSERTION SORT ==> #Insertion sort where each insertion uses a binary search, then a single swap.
#Pros|cons compared to simple insertion sort:
# - pro: fewer comparisons, i.e. good if those are expensive
# - con: less adaptive
SHELL SORT ==> #How:
# - using "gap" k
# - divide array into k subarrays, striding through values
# - e.g. with k 3: [0, 1, 2, 3, 4] -> [0, 3], [1, 4], [2]
# - insertion sort on each subarray
#Repeat above with decreasing "gap" k, e.g. from n/2 down to 1 (must end with gap 1)
# - often used:
# - n/(2**i): original one, not as good
# - 2n/(2**(i+1)) + 1
# - 2**i - 1
# - 1 then 2**i + 1
# - (3**i - 1)/2, up until n/3
#Similar pros|cons as insertion sort but:
# - pros:
# - average time complexity: O(n**(4/3)) usually (depends on gaps)
# - worst time complexity: O(n**(3/2)) usually (depends on gaps)
# - cons:
# - best time complexity: O(n log n), due to sorting far elements first
# - not stable
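#Sketch (TypeScript): a minimal shell sort using the (3**i - 1)/2 gap sequence (1, 4, 13, ...):
#  const shellSort = (values: number[]): number[] => {
#    let gap = 1
#    while (gap < values.length / 3) gap = gap * 3 + 1
#    for (; gap >= 1; gap = Math.floor(gap / 3)) {
#      // gapped insertion sort: each stride-gap subarray is insertion sorted
#      for (let i = gap; i < values.length; i++) {
#        const value = values[i]
#        let j = i
#        for (; j >= gap && values[j - gap] > value; j -= gap) values[j] = values[j - gap]
#        values[j] = value
#      }
#    }
#    return values
#  }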
┌────────────────────┐
│ EXCHANGE SORTS │
└────────────────────┘
BUBBLE SORT ==> #Also called "sinking sort"
#Type of comparison sort
#How:
# - for each value, swap with next value if next value is smaller
# - repeat until sorted
#Similar pros|cons as insertion sorts except:
# - pros:
# - does not require random access
# - cons:
# - usually slower, due to each value shifting to its sorted position with several swaps
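#Sketch (TypeScript): a minimal bubble sort, exiting early once a full pass does no swap:
#  const bubbleSort = (values: number[]): number[] => {
#    let swapped = true
#    while (swapped) {
#      swapped = false
#      for (let i = 1; i < values.length; i++) {
#        if (values[i - 1] > values[i]) {
#          ;[values[i - 1], values[i]] = [values[i], values[i - 1]]
#          swapped = true
#        }
#      }
#    }
#    return values
#  }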
ODD-EVEN SORT ==> #Also named "odd-even transposition sort", "brick sort" or "parity sort"
#Like bubble sort except:
# - iterate over each value pair, not over each value
# - start index of each round alternates between 1 and 0
#Similar pros|cons as bubble sorts except:
# - pros:
# - parallelizable
# - cons:
# - slower for partially sorted arrays, since shifting a single value requires many
# rounds instead of a single one
EXCHANGE SORT ==> #Type of comparison sort.
#Similar to selection sort, but using swaps to find each minimum value.
#For each value x:
# - iterate over each next value y
# - swap if x > y
#Similar pros|cons as bubble sorts except:
# - pros:
# - faster on non-presorted input, i.e. better worst-case time complexity
# - cons:
# - not adaptive, i.e. poorer best-case time complexity
┌─────────────────────┐
│ SELECTION SORTS │
└─────────────────────┘
SELECTION SORT ==> #Type of comparison sort.
#Find then append minimum remaining value, one at a time.
#Similar pros|cons as insertion sorts except:
# - pros:
# - faster when writes|swaps are expensive compared to reads|comparisons
# - because do fewer writes|swaps and more reads|comparisons
# - cons:
# - best time complexity O(n**2): not adaptive
# - average|worst time complexity O(n**2): usually slower
# - not stable
# - not online
#Example:
# - 4 2 1 3 5
# -> 1 4 2 3 5
# -> 1 2 4 3 5
# -> 1 2 3 4 5
# -> 1 2 3 4 5
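#Sketch (TypeScript): a minimal selection sort, doing at most one swap per position:
#  const selectionSort = (values: number[]): number[] => {
#    for (let i = 0; i < values.length - 1; i++) {
#      let minIndex = i
#      for (let j = i + 1; j < values.length; j++) {
#        if (values[j] < values[minIndex]) minIndex = j
#      }
#      if (minIndex !== i) {
#        ;[values[i], values[minIndex]] = [values[minIndex], values[i]]
#      }
#    }
#    return values
#  }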
PARTIAL SELECTION SORT ==> #Selection sort that is partial by stopping after k values
BINGO SORT ==> #Like selection sort, except if there are multiple equal minimum values, they are all moved at once
#Similar pros|cons as selection sorts except:
# - pro: faster when there are equal values
# - con: slightly slower otherwise
┌──────────────┐
│ HEAPSORT │
└──────────────┘
HEAPSORT ==> #Like selection sort but build a heap first to retrieve minimum values.
# - pro: O(log n) instead of O(n) to retrieve minimum
# - con: additional O(n) to build heap (bottom-up heapify)
#Can be done in-place:
# - building a heap can be done in-place
# - when extracting each root (i.e. minimum value) can:
# - place it in a growing sorted array at end
# - swap with the last value, which becomes new root and is moved down the heap until satisfying the heap property
#Pros:
# - general purpose, i.e. few constraints on values
# - best|average time complexity: O(n log n)
# - worst time complexity O(n log n)
# - best and worst performance close to each other
# - in-place, space complexity O(1)
#Cons:
# - not online
# - not adaptive
# - not stable
# - not parallelizable
# - poor locality of reference
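#Sketch (TypeScript): a minimal in-place heapsort using a max-heap, with O(n) bottom-up heapify:
#  const heapsort = (values: number[]): number[] => {
#    const siftDown = (root: number, end: number): void => {
#      for (let child = root * 2 + 1; child < end; child = root * 2 + 1) {
#        if (child + 1 < end && values[child + 1] > values[child]) child++ // bigger child
#        if (values[root] >= values[child]) return // heap property satisfied
#        ;[values[root], values[child]] = [values[child], values[root]]
#        root = child
#      }
#    }
#    // build heap in-place, from last parent up
#    for (let i = Math.floor(values.length / 2) - 1; i >= 0; i--) siftDown(i, values.length)
#    for (let end = values.length - 1; end > 0; end--) {
#      ;[values[0], values[end]] = [values[end], values[0]] // move max to sorted tail
#      siftDown(0, end)
#    }
#    return values
#  }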
TOURNAMENT SORT ==> #Like a heapsort but incremental, with heap size < values size:
# - heap is built bottom-up
# - incrementally:
# - top|min value is picked
# - a new bottom value is added
# - if a new bottom value < last min value, it is frozen:
# - still kept as a bottom leaf
# - but cannot move nor be picked as min value
# - when either:
# - all bottom values are frozen
# - no more new values
# - then:
# - sort frozen values using the heap
# - but output to a separate array
# - repeat everything with a new heap and a new output array
# - this finally results in multiple sorted arrays to combine with a k-way natural merge sort
# - can be done incrementally
#Similar pros|cons as heapsort but:
# - pros:
# - online
# - space complexity O(m) with m being heap size, as opposed to O(n)
# (heapsort when not in-place)
# - cons:
# - not in-place
# - slower average time complexity due to additional logic
PARTIAL HEAP SORT ==> #Heap sort that is partial by:
# - using a max-heap with a fixed size k
# - ignoring values that do not fit in the heap
#Pros:
# - best|average|worst time complexity O(n log k)
# - i.e. better when k is small
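#Sketch (TypeScript): a minimal partial sort keeping the k lowest values in a fixed-size max-heap:
#  const partialSort = (values: number[], k: number): number[] => {
#    const heap: number[] = [] // max-heap of the k lowest values seen so far
#    const siftDown = (root: number): void => {
#      for (let child = root * 2 + 1; child < heap.length; child = root * 2 + 1) {
#        if (child + 1 < heap.length && heap[child + 1] > heap[child]) child++
#        if (heap[root] >= heap[child]) return
#        ;[heap[root], heap[child]] = [heap[child], heap[root]]
#        root = child
#      }
#    }
#    for (const value of values) {
#      if (heap.length < k) {
#        heap.push(value)
#        if (heap.length === k) { // heapify once full, O(k)
#          for (let i = (k >> 1) - 1; i >= 0; i--) siftDown(i)
#        }
#      } else if (value < heap[0]) {
#        heap[0] = value // replace current max, O(log k); bigger values are ignored
#        siftDown(0)
#      }
#    }
#    return heap.sort((a, b) => a - b)
#  }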
┌───────────────┐
│ QUICKSORT │
└───────────────┘
QUICKSORT ==> #Also called "partition-exchange sort"
#Type of comparison sort
#How:
# - choose one of the values as "pivot" ("selection")
# - move values so that all values < pivot come before other ones ("partition")
# - recurse over each group
# - excluding the pivot value, as an optimization, since known to be now sorted
#Pros:
# - general purpose, i.e. few constraints on values
# - good best|average time complexity: O(n log n)
# - O(log n) depth levels
# - optimally low when pivot is close to median value, resulting in equal splits
# - and O(n) at each depth level
# - good locality of reference
# - faster than heapsort or merge sort
# - in-place
# - parallelizable, but not as well as other sorting algorithms due to:
# - high recursion depth level
# - harder to parallelize when in-place
# - adaptive
#Cons:
# - bad worst time complexity: O(n**2)
# - when pivot is always min|max value
# - i.e. effort on choosing right pivot selection
# - space complexity: O(log n), since each depth level keeps track of some information, e.g. of pivot
# - slower on small amount of values
# - not stable
# - not online
LOMUTO QUICKSORT ==> #Quicksort partitioning by iterating over each value:
# - if <= pivot, move it to beginning of array
# - do each move by swapping values, then increasing index of where next swap will occur
#Usually select array's last value as pivot.
#Cons:
# - time complexity O(n**2) for either:
# - already sorted input
# - only equal values
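#Sketch (TypeScript): a minimal Lomuto quicksort, last value as pivot:
#  const quicksort = (values: number[], lo = 0, hi = values.length - 1): number[] => {
#    if (lo >= hi) return values
#    const pivot = values[hi]
#    let boundary = lo // next index where a value <= pivot goes
#    for (let i = lo; i < hi; i++) {
#      if (values[i] <= pivot) {
#        ;[values[i], values[boundary]] = [values[boundary], values[i]]
#        boundary++
#      }
#    }
#    ;[values[hi], values[boundary]] = [values[boundary], values[hi]] // place pivot
#    quicksort(values, lo, boundary - 1) // recurse, excluding the pivot
#    quicksort(values, boundary + 1, hi)
#    return values
#  }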
HOARE QUICKSORT ==> #Quicksort partitioning by iterating over values on both sides:
# - find first value at start >= pivot
# - find first value at end <= pivot
# - swap those values
# - repeat
#Usually select array's middle value as pivot.
#Pros:
# - more efficient than Lomuto quicksort:
# - swapping puts two values in right position instead of one
# - time complexity O(n log n) for either already sorted input, or only equal values
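#Sketch (TypeScript): a minimal Hoare quicksort, middle value as pivot; note both partitions may
#still contain values equal to the pivot, so the boundary index is not excluded from recursion:
#  const hoareQuicksort = (values: number[], lo = 0, hi = values.length - 1): number[] => {
#    if (lo >= hi) return values
#    const pivot = values[Math.floor((lo + hi) / 2)]
#    let i = lo - 1
#    let j = hi + 1
#    while (true) {
#      do { i++ } while (values[i] < pivot) // first value at start >= pivot
#      do { j-- } while (values[j] > pivot) // first value at end <= pivot
#      if (i >= j) break
#      ;[values[i], values[j]] = [values[j], values[i]]
#    }
#    hoareQuicksort(values, lo, j)
#    hoareQuicksort(values, j + 1, hi)
#    return values
#  }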
SELECTION QUICKSORT ==> #Quicksort using a selection algorithm to select the pivot
RANDOMIZED QUICKSORT ==> #Quicksort using a random value as pivot
#Pros:
# - worst time complexity still O(n**2) but much more unlikely
#Cons:
# - slower best|average time complexity
# - not adaptive
#Can also shuffle values initially:
# - same pros|cons as above, but stronger
# - also allow picking first value as pivot instead, which is faster
MEDIAN OF MEDIANS QUICKSORT ==> #Also named "BFPRT"
#Quicksort using median of medians to select pivot.
#Pros|cons similar to randomized quicksort, but:
# - pros:
# - even more improved worst time complexity: O(n log n), since pivot is guaranteed close to median
# - adaptive
# - cons:
# - even slower best|average time complexity
SEDGEWICK QUICKSORT ==> #Quicksort selecting the median of the first, middle and last element as pivot.
#I.e. a good heuristic without knowing whether the array is sorted, reversed, or neither.
MULTI-PIVOT QUICKSORT ==> #Also named "multiquicksort"
#Quicksort with multiple pivots.
#Pros:
# - lower depth level:
# - better for external sorting
# - faster if moving|writing values is expensive
# - even more parallelizable
#Cons:
# - slower average time complexity
EXTERNAL DISTRIBUTION SORT ==> #Multi-pivot quicksort, when used for external sorting
THREE-WAY PARTITION QUICKSORT ==> #Quicksort partitioning around a single pivot into 3 groups (<, ==, >), not recursing on middle partition, i.e. on values == pivot
#Pros:
# - best time complexity O(n) when all values are equal
#Cons:
# - slower when values are rarely equal
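#Sketch (TypeScript): a minimal three-way ("Dutch national flag") partition quicksort:
#  const threeWayQuicksort = (values: number[], lo = 0, hi = values.length - 1): number[] => {
#    if (lo >= hi) return values
#    const pivot = values[lo]
#    let lt = lo // values[lo..lt-1] < pivot
#    let gt = hi // values[gt+1..hi] > pivot
#    let i = lo  // values[lt..i-1] === pivot
#    while (i <= gt) {
#      if (values[i] < pivot) {
#        ;[values[lt], values[i]] = [values[i], values[lt]]
#        lt++; i++
#      } else if (values[i] > pivot) {
#        ;[values[gt], values[i]] = [values[i], values[gt]]
#        gt--
#      } else i++
#    }
#    threeWayQuicksort(values, lo, lt - 1) // middle partition (== pivot) not recursed
#    threeWayQuicksort(values, gt + 1, hi)
#    return values
#  }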
PARTIAL QUICKSORT ==> #Also called "quickselsort"
#Quicksort that is partial, by first using quickselect to only sort the desired sub-array
┌─────────────────┐
│ MERGE SORTS │
└─────────────────┘
MERGE SORT ==> #Type of comparison sort
#Divide array into many groups of 1 value ("initial grouping") then:
# - merge each pair of groups into a single sorted group, by comparing their values in order ("merge step")
# - repeat with double group size ("passes")
#When merging each pair of groups, must create a temporary array with the result:
# - cannot do in-place
# - must be same size as array
# - optimization: can re-use same array for each depth level
#Pros:
# - general purpose, i.e. few constraints on values
# - best|average|worst time complexity: O(n log n)
# - best:
# - when, for each pair of groups, all values in one group are > all values in the other
# - because only needs to iterate through the values of one of the groups
# - i.e. when already sorted or reverse sorted
# - i.e. adaptive
# - average
# - when uniformly random, ~74% of worst time complexity
# - worst:
# - opposite of best case, i.e. when values alternate
# - specific sequence is known and called the "sorting numbers"
# - about twice as slow as the best case
# - very good locality of reference
# - very parallelizable
# - stable
# - does not require random access
#Cons:
# - space complexity O(n): due to temporary array
# - not in-place
TOP-DOWN MERGE SORT ==> #Merge sort where initial grouping:
# - splits whole array into 2 groups
# - which are recursively split until creating groups of 1 value
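#Sketch (TypeScript): a minimal top-down merge sort, using copies instead of one shared temporary array:
#  const mergeSort = (values: number[]): number[] => {
#    if (values.length <= 1) return values
#    const middle = Math.floor(values.length / 2)
#    const left = mergeSort(values.slice(0, middle))
#    const right = mergeSort(values.slice(middle))
#    const merged: number[] = []
#    let i = 0
#    let j = 0
#    while (i < left.length && j < right.length) {
#      // <= keeps the sort stable: equal values stay in original order
#      merged.push(left[i] <= right[j] ? left[i++] : right[j++])
#    }
#    return merged.concat(left.slice(i), right.slice(j))
#  }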
BOTTOM-UP MERGE SORT ==> #Merge sort where initial grouping:
# - directly divides whole array into many groups of 1 value
#Similar time|space complexity as top-down merge sort
NATURAL MERGE SORT ==> #Merge sort where initial grouping:
# - Uses presorted parts of the array ("runs")
# - By going through whole array and group sequences that monotonically increase|decrease
#Pros:
# - best time complexity O(n): if whole array already sorted
#Cons:
# - slower average time complexity: due to looking for runs, i.e. not good if array not presorted
ONLINE MERGE SORT ==> #Can be online:
# - sort new values using any sorting algorithm
# - then do natural merge sort
LINKED LIST MERGE SORT ==> #Merge sort that stores each value into linked lists instead of arrays.
#Pros:
# - space complexity O(1): values are moved between linked lists instead of being copied
#Cons:
# - slower time complexity: due to extra cost of linked lists logic
PING PONG MERGE SORT ==> #Merge sort that, at each new depth level, swaps which array is considered the temporary and original one.
#Pros:
# - faster time complexity: avoids moving values back from the temporary to the original array
K-WAY MERGE SORT ==> #Also named "multiway merge sort"
#Merge sort where merge step merges multiple groups.
#As opposed to merging pairs of groups ("2-way|binary merge sort"):
# - pros:
# - even more parallelizable
# - lower depth level:
# - even no recursion if k == number of groups
# - no temporary array
# - still need space complexity O(n) for the output array
# - better for external sorting:
# - faster if moving|writing values is expensive
# - can optimize performance when groups have different sizes
# - cons:
# - otherwise, slower average time complexity
ITERATIVE K-WAY MERGE SORT ==> #K-way merge sort which merges multiple groups together 2 by 2.
#Optimization: merge smaller arrays together first.
DIRECT K-WAY MERGE SORT ==> #K-way merge sort which merges multiple groups together all at once.
LINEAR K-WAY MERGE SORT ==> #Direct k-way merge sort that does a linear search for the next min value.
#Pro: fast on small k
#Con: average|worst time complexity O(k) per extracted value
HEAP K-WAY MERGE SORT ==> #Direct k-way merge sort that:
# - uses a heap with the min element of each group
# - to find the min element of all groups
#Pro: best|average|worst time complexity O(log k) per extracted value
TOURNAMENT K-WAY MERGE SORT ==> #Like heap direct k-way merge sort, but using loser tree instead (see its doc)
#Pros|cons similar to heap k-way merge sort but:
# - pro: faster average time complexity
PARALLEL RECURSION MERGE SORT ==> #Merge sort where sibling merges are performed in parallel
PARALLEL MERGING MERGE SORT ==> #Merge sort where each merge step is broken down into multiple merges performed in parallel
#Since merge step operates on two presorted groups:
# - use middle element of one group
# - the longer one if not same size
# - find corresponding element in other group
# - e.g. with binary search
# - merge each half of each group with other, then concatenate
NEAREST SMALLER VALUES MERGE SORT #Merge sort where merge step uses "all nearest smaller values" algorithm (see its doc)
==> # - concatenate first group with reversed second group
# - calculate all nearest smaller values, in both directions
# - for each value, bigger neighbor should be previous value in merged array
┌───────────────┐
│ BEAD SORT │
└───────────────┘
BEAD SORT ==> #Also called "gravity sort"
#Neither a comparison nor a distribution sort.
#Keys must be positive integers
#Similar to using an abacus, mimicking how beads fall
#Steps:
# - put numbers in a matrix: each number n is a row, filled with n 1s
# - remove all vertical gaps (i.e. make 1s "fall")
# - each row will represent a sorted number
#Example:
# - [2, 4, 1, 3, 3]
# -> [1, 1, 0, 0] (2)
# [1, 1, 1, 1] (4)
# [1, 0, 0, 0] (1)
# [1, 1, 1, 0] (3)
# [1, 1, 1, 0] (3)
# -> [1, 0, 0, 0] (1)
# [1, 1, 0, 0] (2)
# [1, 1, 1, 0] (3)
# [1, 1, 1, 0] (3)
# [1, 1, 1, 1] (4)
#Another way to look at it:
# - create array with length === max number
# - for each number n, increment the first n items of array
# - for each array item, replace by difference with next item
# - for each array item m, output its 1-based index, repeated m times
#Example:
# - [2, 4, 1, 3, 3]
# -> [0, 0, 0, 0]
# -> [1, 1, 0, 0] (2)
# -> [2, 2, 1, 1] (4)
# -> [3, 2, 1, 1] (1)
# -> [4, 3, 2, 1] (3)
# -> [5, 4, 3, 1] (3)
# -> [1, 1, 2, 1] (difference)
# -> [1, 2, 3, 3, 4] (indexes)
#Pros:
# - best time complexity: O(n)
#Cons:
# - only works with positive integers
# - i.e. sorts bare keys only, so stability is moot
# - not in-place, high space complexity: O(n**2)
# - average|worst time complexity: O(S), with S being sum of all keys
# - best time complexity usually requires specialized hardware
# - not parallelizable
# - not online
# - not adaptive
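#Sketch (TypeScript): a minimal bead sort using the column-count formulation above:
#  const beadSort = (values: number[]): number[] => {
#    const columns = new Array(Math.max(...values)).fill(0)
#    for (const value of values) {
#      for (let i = 0; i < value; i++) columns[i]++ // drop the value's beads
#    }
#    const sorted: number[] = []
#    for (let i = 0; i < columns.length; i++) {
#      const count = columns[i] - (columns[i + 1] ?? 0) // rows ending at this height
#      for (let j = 0; j < count; j++) sorted.push(i + 1)
#    }
#    return sorted
#  }
#  beadSort([2, 4, 1, 3, 3]) // [1, 2, 3, 3, 4]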
┌────────────┐
│ HYBRID │
└────────────┘
MULTI-KEY QUICKSORT ==> #Also named "three-way radix quicksort"
#Radix sort using quicksort as inner sorting algorithm.
#Often uses three-way partition quicksort due to high probability of equal values
HYBRID INSERTION SORT ==> #Any sort algorithm that switches to insertion sort when number of elements is low
#Due to good performance on low array size
#Often used threshold: 16 elements
TILED MERGE SORT ==> #Hybrid insertion sort with merge sort.
#I.e. insertion sort on each group of m values, then natural merge sort
INTROSORT ==> #Also named "introspective sort"
#Hybrid insertion sort with mix of:
# - quicksort:
# - at start
# - due to good best|average time complexity
# - heapsort
# - if quicksort too slow (see introselect doc)
# - i.e. improved worst time complexity
┌─────────┐
│ BAD │
└─────────┘
BOGOSORT ==> #Also called "permutation sort", "stupid sort", "slowsort", "shotgun sort", "monkey sort"
#Randomly shuffles whole array until shuffle output happens to be also sorted
# - or alternatively, try every permutation
#Only meant as an example of a bad sorting algorithm
#Pros:
# - in-place, space complexity: O(1)
#Cons:
# - time complexity:
# - best: O(n): if already sorted
# - average: O(n * n!)
# - worst: O(n * n!) (if try every permutation) or infinite (otherwise)
# - not stable
# - not parallelizable
# - not online
# - not adaptive
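#Sketch (TypeScript): a minimal bogosort, Fisher-Yates shuffling until sorted:
#  const isSorted = (values: number[]): boolean =>
#    values.every((value, i) => i === 0 || values[i - 1] <= value)
#  const bogosort = (values: number[]): number[] => {
#    while (!isSorted(values)) {
#      for (let i = values.length - 1; i > 0; i--) { // shuffle
#        const j = Math.floor(Math.random() * (i + 1))
#        ;[values[i], values[j]] = [values[j], values[i]]
#      }
#    }
#    return values
#  }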
┌────────────────────────┐
│ COMPARATOR NETWORK │
└────────────────────────┘
COMPARATOR NETWORK ==> #Only operation is comparison:
# - between 2 values at specific indices
# - swap them if first > second
#Input size is usually fixed
#Either:
# - to sort values: "sorting network"
# - type of comparison sort
# - also called "kruskal hub"
# - to select k-th value (selection algorithm)
#2 comparisons can be parallelized if comparing different indices
# - especially in hardware
#Time complexity:
# - number of comparisons ("size"):
# - fixed
# - either:
# - total: O(n log2 n) for best algorithms
# - parallel ("rounds" or "stages")
# - number of swaps:
# - depends on input
# - i.e. has:
# - best: always 0
# - average
# - worst ("depth"): O(log2 n) for best algorithms
#Pros:
# - can be implemented in:
# - high-level software
# - assembly
# - hardware
# - very fast, by simplifying implementation
# - very parallelizable
# - in-place, space complexity O(1), 0 memory
# - adaptive
#Cons:
# - fixed array size
# - similar to loop unrolling, no logic re-use between each comparison
# - low branch locality of reference
# - lots of code and impractical with high array size
# - not online
KNUTH DIAGRAMS ==> #Representation of a comparator network
#Horizontal lines ("wires") are each value
#Vertical lines ("connectors") are comparisons
# - swap if top > bottom, i.e. smaller value ends on top wire
# - can reverse direction with an arrow or by crossing lines, but try to avoid for clarity
0-1 PRINCIPLE ==> #To know if comparator network works, enough to use only 0-1 as values,
#i.e. bitstrings, as opposed to all possible numbers
UNI-DIRECTIONAL #For each comparator enforcing first <= second, first's index < second's index
COMPARATOR NETWORK ==> #As opposed to "bi-directional"
RECURSIVE SORTING NETWORK ==> #Sorting network for arbitrary input length
OPTIMAL SORTING NETWORK ==> #Sorting network with a proven min size and depth
#Known only for n <= 17
#For n 13-17, network for min size and min depth are different.
#Optimal:
# - n 1:
# - depth|size 0
# - no sorting needed
# - n 2:
# - depth|size 1
# - single comparison
# - n 3:
# - depth|size 3
# - compare 0-1, 1-2, 0-1
# - n 4:
# - depth 3, size 5
# - compare 0-1 2-3, 0-2 1-3, 1-2
# - n 5-17: depth ~floor((n+5)/2), size ~(n*log2(n))-4
HALF-CLEANER ==> #Building block used by multiple sorting networks
#Multiple comparisons in a single round
#Over n wires
# - can be lower than total amount of wires
#Compare each i with i+n/2
# - i.e. if 2 wires, just a single comparison
BITONIC SORT ==> #Type of sorting network
#Use similar logic as merge sort, i.e. binary divide-and-conquer
#How:
# - sort each half recursively
# - but second half in reverse
# - by reversing comparisons' direction
# - i.e. both halves are "bitonic" (one increases, the other decreases)
# - concatenate both halves then sort:
# - use 1 half-cleaner over whole array, then 2 half-cleaners over each half, and so on
#Can make algorithm uni-directional instead by:
# - sorting second halves normally (not in reverse)
# - when merging halves, making first half-cleaner (not next ones) compare each i with n-i instead
#Pros:
# - recursive
# - low rounds, low depth: log2(n) * (log2(n) + 1) / 2
#Cons:
# - not stable
# - high size:
# - n * log2(n) * (log2(n) + 1) / 4
# - ~ O(n * log2(n)**2)
# - input length must be 2**m
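#Sketch (TypeScript): a minimal bitonic sort, assuming input length is a power of 2:
#  const bitonicMerge = (values: number[], lo: number, n: number, up: boolean): void => {
#    if (n <= 1) return
#    const half = n / 2
#    for (let i = lo; i < lo + half; i++) { // half-cleaner over n wires
#      if ((values[i] > values[i + half]) === up) {
#        ;[values[i], values[i + half]] = [values[i + half], values[i]]
#      }
#    }
#    bitonicMerge(values, lo, half, up)
#    bitonicMerge(values, lo + half, half, up)
#  }
#  const bitonicSort = (values: number[], lo = 0, n = values.length, up = true): number[] => {
#    if (n > 1) {
#      const half = n / 2
#      bitonicSort(values, lo, half, true)         // first half ascending
#      bitonicSort(values, lo + half, half, false) // second half descending -> bitonic
#      bitonicMerge(values, lo, n, up)
#    }
#    return values
#  }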
PAIRWISE SORTING NETWORK ==> #Type of sorting network
#Pros:
# - recursive
#Cons:
# - not stable
# - higher number of comparisons
BOSE-NELSON SORT ==> #Type of sorting network
#Pros:
# - recursive
#Cons:
# - not stable
# - higher number of comparisons
MERGE-EXCHANGE SORT ==> #Type of sorting network
#Pros:
# - low number of comparisons
# - recursive
#Cons:
# - not stable
BUBBLE|INSERTION SORTING NETWORK #Similar to bubble|insertion sort, but for sorting networks
==> #How:
# - compare each element with next, until end - k
# - repeat with different k
#k can go:
# - from 0 to n: "bubble sorting network"
# - from n to 0: "insertion sorting network"
# - when parallelized, those two are identical
#Pro:
# - stable
# - recursive
#Con: high:
# - size: n(n-1)/2, i.e. O(n^2)
# - rounds: 2n - 3
# - depth: 2n - 3