\section{Introduction - Revision}
\subsection{Rates of Growth}
\paragraph{The Problem}
To analyse algorithms, we need a way to compare the two functions
representing the runtimes of the algorithms. However, comparing raw
runtime values directly presents a few issues.
\begin{itemize}
\item Outliers may affect the results.
\item One algorithm may be slower up to some input size, but overtake
the other beyond that point.
\item The runtime of an algorithm may vary depending on the
implementation or the architecture of the machine, where some
instructions are faster than others.
\end{itemize}
\paragraph{Asymptotic Growth}
We often prefer to describe algorithms in terms of the asymptotic
growth of their runtime with respect to the input size.
A function that quadruples with every extra input item will eventually
have a greater runtime than one that increases linearly with each new
input, for all large enough input sizes.
\paragraph{Big \(O\) Notation}
We say \(f(n) = O(g(n))\) if there exist positive constants \(C, N\)
such that
\[
0 \leq f(n) \leq C g(n) \quad \forall n \geq N.
\]
We then say that \(g(n)\) is an asymptotic upper bound for \(f(n)\).
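For example, \(3n^2 + 5n = O(n^2)\): taking \(C = 4\) and \(N = 5\), we
have \(5n \leq n^2\) for all \(n \geq 5\), and so
\(0 \leq 3n^2 + 5n \leq 4n^2\) for all \(n \geq N\).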
\paragraph{Big Omega Notation}
We say \(f(n) = \Omega(g(n))\) if there exist positive constants \(c, N\)
such that
\[0 \leq c g(n) \leq f(n) \quad \forall n \geq N.\]
Then, \(g(n)\) is said to be an asymptotic lower bound for \(f(n)\).
It is useful for saying that a problem requires at least
\(\Omega(g(n))\) time to solve.
\paragraph{Landau Notation}
\(f(n) = \Omega (g(n))\) if and only if \(g(n) = O(f(n))\).
There are strict versions of Big \(O\) and Big \(\Omega\) notation;
these are little \(o\) and little \(\omega\) respectively.
We say \(f(n) = \Theta(g(n))\) if
\[
f(n) = O(g(n)) \text{ and } f(n) = \Omega(g(n)).
\]
That is, both \(f\) and \(g\) have the same asymptotic growth.
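For example, \(n^2 + n = \Theta(n^2)\), since
\(n^2 \leq n^2 + n \leq 2n^2\) for all \(n \geq 1\).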
\paragraph{Logarithms}
Logarithms are defined so that for \(a, b > 0\) with \(a \neq 1\),
\[ n = \log_a b \Leftrightarrow a^n = b.\]
They have the following properties:
\begin{itemize}
\item \(a^{\log_a n} = n\)
\item \(\log_a (mn) = \log_a (m) + \log_a(n)\)
\item \(\log_a (n^k) = k\log_a (n)\)
\end{itemize}
By the change of base,
\[
\log_a(x) = \frac{\log_b (x)}{\log_b (a)}.
\]
As such, the denominator is constant with respect to \(x\), and so all log bases are equivalent under Big \(\Theta\) notation.
\subsection{Assumed Data Structures}
\paragraph{Arrays}
We assume static arrays (though it is possible to extend to dynamic arrays).
\begin{itemize}
\item We assume random access in \(O(1)\)
\item Insert / delete: \(O(n)\)
\item Search: \(O(n)\), or \(O(\log n)\) if sorted (via binary search)
\end{itemize}
\paragraph{Linked Lists}
We assume the linked lists are doubly-linked since the \(2\times \)
overhead is negligible.
\begin{itemize}
\item Accessing next / previous: \(O(1)\)
\item Insert / delete to head or tail: \(O(1)\)
\item Search: \(O(n)\)
\end{itemize}
\paragraph{Stacks}
Last in, first out.
\begin{itemize}
\item Accessing top: \(O(1)\)
\item Insert / delete from top: \(O(1)\)
\end{itemize}
\paragraph{Queue}
First in, first out.
\begin{itemize}
\item Access front: \(O(1)\)
\item Insert at back (enqueue): \(O(1)\)
\item Delete from front (dequeue): \(O(1)\)
\end{itemize}
\paragraph{Hash Tables}
Store values by their hashed keys. Ideally, no two keys hash to the
same slot; however, this cannot be guaranteed in general.
\begin{itemize}
\item Search is expected \(O(1)\); in the worst case, however, we may
have to scan through all the values in \(O(n)\).
\item Insertion is likewise expected \(O(1)\), with an \(O(n)\) worst
case.
\item Deletion follows the same pattern: expected \(O(1)\) and worst
case \(O(n)\).
\end{itemize}
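As a minimal sketch in Python (illustrative only; the name
\texttt{ChainedHashTable} and the slot count are not from the notes),
collisions can be resolved by keeping a chain of entries per slot:
\begin{verbatim}
# Minimal hash table with chaining (illustrative sketch).
# Each slot holds a list of (key, value) pairs; colliding keys
# share a slot and are found by scanning the chain.
class ChainedHashTable:
    def __init__(self, num_slots=8):
        self.slots = [[] for _ in range(num_slots)]

    def _chain(self, key):
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):       # expected O(1)
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value) # overwrite an existing key
                return
        chain.append((key, value))

    def search(self, key):              # expected O(1), worst O(n)
        for k, v in self._chain(key):
            if k == key:
                return v
        return None
\end{verbatim}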
\paragraph{Binary Search Trees}
We store (comparable) keys (or key-value pairs) in a binary tree, where
each node has at most two children, a left and a right. The key of a
node must be greater than every key in its left sub-tree and less than
every key in its right sub-tree.
\begin{itemize}
\item In the best case, the height of the tree is \(h = \log_2 n\);
such a tree is \textit{balanced}. In the worst case, however, the tree
is a long chain of height \(n\).
\item The average search takes \(O(\log n)\). If the tree is
self-balancing, then search and the other operations are guaranteed to
run in \(O(\log n)\).
\end{itemize}
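A minimal sketch in Python of an (unbalanced) binary search tree,
assuming distinct keys (the function names are illustrative):
\begin{verbatim}
# Unbalanced binary search tree: keys smaller than a node's key go
# left, larger keys go right. Search and insert cost O(h), where h
# is the height (log n if balanced, n in the worst case).
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root                    # duplicate keys are ignored

def search(root, key):
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root                    # the node, or None if absent
\end{verbatim}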
\paragraph{Binary Heap}
A max-heap is such that the value of a node is greater than or equal to
the values of all its children. A min-heap follows the same principle
in reverse.
\begin{itemize}
\item Finding the maximum means reading the top item, in \(O(1)\)
\item Deleting the top item also requires re-balancing.
\item Re-balancing the heap takes \(O(\log n)\) time.
\item Insertion also takes \(O(\log n)\) time.
\end{itemize}
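As a short sketch, Python's standard \texttt{heapq} module maintains a
binary \textit{min}-heap inside a list (the notes describe a max-heap,
but the principle is mirrored):
\begin{verbatim}
# Binary min-heap via Python's standard heapq module.
import heapq

heap = []
for x in [5, 1, 4, 2, 3]:
    heapq.heappush(heap, x)   # insert, then sift up: O(log n)

print(heap[0])                # read the minimum: O(1)  -> 1
print(heapq.heappop(heap))    # delete the minimum, then
                              # re-balance by sifting down: O(log n)
\end{verbatim}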
\subsection{Algorithms}
\paragraph{Linear Search}
Given an array \(A\) of \(n\) integers, we may determine whether a
value is in \(A\) by scanning the array linearly. This takes \(O(n)\)
time.
\paragraph{Sorted Arrays - Binary Search}
Given a sorted array \(A\) of \(n\) integers, we may determine whether
a value is in \(A\) by binary search. We pick the midpoint \(m\) of
\(A\). If \(m\) is the value desired, we return. Otherwise, we continue
recursively, halving the search space: if \(m\) is greater than the
value we desire, we search the left sub-array; otherwise, we search the
right sub-array.
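A minimal iterative sketch in Python (the function name
binary_search is illustrative):
\begin{verbatim}
# Binary search on a sorted array: O(log n) comparisons, since the
# search space halves on every iteration.
def binary_search(A, target):
    lo, hi = 0, len(A) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if A[mid] == target:
            return mid        # found: return its index
        elif A[mid] > target:
            hi = mid - 1      # midpoint too large: go left
        else:
            lo = mid + 1      # midpoint too small: go right
    return -1                 # value not present

print(binary_search([1, 3, 5, 7, 9], 7))  # -> 3
\end{verbatim}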
\paragraph{Decision Problems and Optimisation Problems}
Decision problems are of the form
\[\text{Given some parameters \(X\), can you \dots}\]
Optimisation problems are of the form
\[\text{What is the smallest \(X\) for which you can \dots}\]
\paragraph{Comparison Sorting}
We are given an array of \(n\) items. Using only pairwise comparisons,
we may sort the array in \(O(n\log n)\) time at best.
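As one sketch of an \(O(n\log n)\) comparison sort (merge sort; the
notes do not commit to a particular algorithm):
\begin{verbatim}
# Merge sort: split in half, sort each half recursively, then merge.
# The recurrence T(n) = 2T(n/2) + O(n) solves to O(n log n).
def merge_sort(A):
    if len(A) <= 1:
        return A
    mid = len(A) // 2
    left, right = merge_sort(A[:mid]), merge_sort(A[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):  # merge step: O(n)
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

print(merge_sort([5, 2, 4, 6, 1, 3]))  # -> [1, 2, 3, 4, 5, 6]
\end{verbatim}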
\paragraph{Asymptotic Lower Bound on Comparison Sorting}
There are \(n!\) permutations of the array, of which exactly one is the
correct sorted order.
Each comparison has two outcomes, so \(k\) comparisons yield at most
\(2^k\) distinct sequences of results, and the algorithm can therefore
distinguish at most \(2^k\) permutations.
Thus, we need enough comparisons that
\(2^k \geq n!\).
That is,
\(k \geq \log_2 (n!)\).
Since \(n!\) contains at least \(n/2\) factors that are each at least
\(n/2\), we have \(n! \geq (n/2)^{n/2}\), and therefore
\[
k \geq \frac{n}{2}\log_2 \left(\frac{n}{2}\right) = \Omega(n\log n).
\]
\subsection{Non-Comparison Sort}
\paragraph{Counting Sort}
Create another array \(B\) of size \(k\), where \(k\) is the maximum
value of an element.
We iterate through the input and count the number of times each value
occurs, incrementing the entry of \(B\) indexed by the current
element's value.
The time complexity is \(O(n + k)\), and \(O(k)\) extra space is
required.
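A minimal sketch in Python, assuming non-negative integer inputs
bounded by \(k\):
\begin{verbatim}
# Counting sort for integers in the range 0..k: O(n + k) time and
# O(k) extra space for the count array B.
def counting_sort(A, k):
    B = [0] * (k + 1)            # B[v] = occurrences of value v
    for x in A:
        B[x] += 1
    out = []
    for v, count in enumerate(B):
        out.extend([v] * count)  # emit each value count-many times
    return out

print(counting_sort([4, 1, 3, 4, 3], k=4))  # -> [1, 3, 3, 4, 4]
\end{verbatim}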
\paragraph{Bucket Sort}
Distribute the items into buckets \(1, \dots, k\), where each bucket covers a range of values and every item in bucket \(i\) is less than every item in bucket \(i + 1\). We may sort within each bucket with any algorithm and then concatenate all the buckets together.
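A minimal sketch, assuming (since the notes do not specify the key
type) floats spread over \([0, 1)\) so that equal-width buckets make
sense:
\begin{verbatim}
# Bucket sort sketch for floats in [0, 1): distribute into k equal
# ranges, sort each bucket, then concatenate the buckets in order.
def bucket_sort(A, k=10):
    buckets = [[] for _ in range(k)]
    for x in A:
        buckets[int(x * k)].append(x)  # bucket i covers [i/k, (i+1)/k)
    out = []
    for b in buckets:
        out.extend(sorted(b))          # any sort works within a bucket
    return out

print(bucket_sort([0.42, 0.12, 0.99, 0.50, 0.13]))
# -> [0.12, 0.13, 0.42, 0.5, 0.99]
\end{verbatim}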
\paragraph{Radix Sort}
We assume an array \(A\) of \(n\) keys, each consisting of \(k\)
symbols.
We bucket the keys by their first symbol; then, within each bucket, we
recursively bucket the keys by the next symbol.
This takes \(O(nk)\) time.
\paragraph{LSD Radix Sort}
Sort all keys by their last symbol, then by their second-last symbol, and so on.
The sorting algorithm at each stage must be stable.
% !TODO: Finish
The time is \(O(nk)\) with space complexity of \(O(n + k)\).
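A minimal sketch in Python, assuming fixed-width string keys; each pass
here is a stable bucket pass on one symbol, standing in for the stable
counting sort:
\begin{verbatim}
# LSD radix sort for n keys of k symbols each: sort by the last
# symbol, then the second-last, and so on. Each pass is stable, so
# the ordering from earlier passes is preserved. O(nk) time overall.
def lsd_radix_sort(keys, k):
    for p in range(k - 1, -1, -1):
        buckets = {}
        for s in keys:                 # stable: keys keep their order
            buckets.setdefault(s[p], []).append(s)
        keys = [s for sym in sorted(buckets) for s in buckets[sym]]
    return keys

print(lsd_radix_sort(["bca", "abc", "cab", "bac"], k=3))
# -> ['abc', 'bac', 'bca', 'cab']
\end{verbatim}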
\subsection{Graphs}