Extended Algorithms and Programming Techniques
25 April 2021
The course is split into four topics: Divide and Conquer, Greedy Algorithms, Dynamic Programming, and Linear Programming and Reductions.
However, for the sake of organising these notes, I've included an extra section to cover knowledge not covered in the course's prerequisite (COMP2521), and split Linear Programming and Reductions into two sections.
If $a, b, c > 0$ then $$a^{\log_{b}c} = c^{\log_{b}a}$$ Proof
$$ \begin{align*} \log_{b}c \cdot \log_{b}a &= \log_{b}a \cdot \log_{b}c \\ \log_{b}(a^{\log_{b}c}) &= \log_{b}(c^{\log_{b}a}) \\ a^{\log_{b}c} &= c^{\log_{b}a} \end{align*} $$
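For example, taking $a = 2$, $b = 2$ and $c = 8$ gives $$2^{\log_{2}8} = 2^3 = 8 = 8^1 = 8^{\log_{2}2}.$$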
Complex numbers $z = a + ib$ can be represented using their modulus $|z| = \sqrt{a^2 + b^2}$ and argument $\arg(z)$ (the angle $z$ makes with the positive real axis),
and satisfying: $$z = |z|e^{i \arg(z)} = |z|(\cos(\arg(z)) + i \sin(\arg(z))).$$
and
$$z^n = \left(|z|e^{i \arg(z)}\right)^n = |z|^ne^{in\arg(z)} = |z|^n(\cos(n\arg(z)) + i \sin(n\arg(z))).$$
Roots of unity of order $n$ are complex numbers which satisfy $z^n = 1$.
For $\omega_n = e^{i \frac{2\pi}{n}}$ and for all $k$ such that $0 \leq k \leq n - 1$, $$ (\omega_n^k)^n = ((\omega_n)^k)^n = (\omega_n)^{nk} = ((\omega_n)^n)^k = 1^k = 1.$$
If $k + m \geq n$ then $k + m = n + l$ for $l = (k + m) \bmod n$ and we have $$\omega_n^k \omega_n^m = \omega_n^{k+m} = \omega_n^{n+l} = \omega_n^n \omega_n^l = 1 \cdot \omega_n^l = \omega_n^l \text{ where } 0 \leq l \leq n - 1.$$
Therefore, the product of any two roots of unity of the same order is just another root of unity of the same order (this is not the case for addition, as the sum of two roots of unity is usually not another root of unity).
The Cancellation Lemma: $\omega_{kn}^{km} = \omega_n^m$, and its proof is below $$\omega_{kn}^{km} = (\omega_{kn})^{km} = \left(e^{i \frac{2\pi}{kn}}\right)^{km} = e^{i \frac{2\pi k m}{k n}} = e^{i \frac{2\pi m}{n}} = \left(e^{i \frac{2\pi}{n}}\right)^m = \omega_n^m.$$
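As a quick numerical check of these properties (a minimal sketch using Python's built-in cmath module; the helper name and the sample values of $n$, $k$ and $m$ are illustrative only):

```python
import cmath

# Illustrative helper: the k-th root of unity of order n, i.e. omega_n^k = e^(i*2*pi*k/n)
def root_of_unity(n: int, k: int) -> complex:
    return cmath.exp(2j * cmath.pi * k / n)

n, k, m = 8, 3, 5
# (omega_n^k)^n should equal 1 (up to floating point error)
print(abs(root_of_unity(n, k) ** n - 1) < 1e-9)                       # True
# Cancellation Lemma: omega_{kn}^{km} equals omega_n^m
print(abs(root_of_unity(k * n, k * m) - root_of_unity(n, m)) < 1e-9)  # True
```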
Divide and conquer algorithms recursively break a problem down into two or more non-overlapping subproblems, solve those subproblems, and then combine their solutions to solve the original problem.
Common examples of Divide and Conquer algorithms include
Given two 'big integers', i.e. numbers stored as arrays where the value at the $i^{th}$ index is the $i^{th}$ digit of the integer (least significant digit first) and there are $n$ digits, a sample solution would look like
# Naive method for big integer multiplication
def mul(A: list[int], B: list[int]) -> list[int]:
n, m = len(A), len(B)
C = [0] * (n + m)
# Apply multiplication
for i in range(n):
for j in range(m):
C[i + j] += A[i] * B[j]
# Add carry amount to next digit and take mod of 10
for i in range(n + m - 1):
C[i + 1] += C[i] // 10
C[i] %= 10
return C
and hence has a runtime of $\Theta(nm)$, which is $\Theta(n^{2})$ when both numbers have $n$ digits.
Say we want to find the product of $$P(x) = a_1x + a_0,$$ $$Q(x) = b_1x + b_0.$$ A naive solution will require 4 multiplications: $A = a_1b_1$, $B = a_1b_0$, $C = a_0b_1$ and $D = a_0b_0$, so that
$$P(x)Q(x) = Ax^2 + (B + C)x + D.$$ However we know that $$E = (a_1 + a_0)(b_1 + b_0) = a_1b_1 + a_1b_0 + a_0b_1 + a_0b_0 = A + B + C + D,$$ which requires only 1 multiplication and 2 additions (which can be computed easily), and gives us $B + C = E - A - D$.
Hence to find the final product, we only need the three multiplications $A$, $D$ and $E$: $$P(x)Q(x) = Ax^2 + (E - A - D)x + D.$$
If we want to multiply $A$ with $B$, we can represent them as
$$ \begin{align*} A &= 10^{\frac{n}{2}}A_1 + A_0 \\ B &= 10^{\frac{n}{2}}B_1 + B_0. \end{align*} $$
With this, we can apply Karatsuba's method, where
$$ \begin{align*} x &= A_1B_1 \\ z &= A_0B_0 \\ y &= (A_1 + A_0)(B_1 + B_0) - x - z \\ AB &= 10^{n}x + 10^{\frac{n}{2}}y + z. \end{align*} $$
At each function call, three multiplications are performed (each computed recursively by the same algorithm), the input size is halved in each recursive call, and the additions run in $O(n)$. Therefore Karatsuba's method has the recurrence $$ T(n) = 3T\left(\frac{n}{2}\right) + O(n), $$ which solves to $O(n^{\lg 3}) \approx O(n^{1.585})$.
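Below is a minimal Python sketch of Karatsuba's method (the function name, the use of Python's arbitrary-precision integers rather than digit arrays, and splitting on the decimal length are assumptions made for brevity):

```python
# A sketch of Karatsuba multiplication on non-negative integers (illustrative only)
def karatsuba(a: int, b: int) -> int:
    # Base case: single-digit numbers are multiplied directly
    if a < 10 or b < 10:
        return a * b
    # Split both numbers around half the larger number of digits
    half = max(len(str(a)), len(str(b))) // 2
    a1, a0 = divmod(a, 10 ** half)
    b1, b0 = divmod(b, 10 ** half)
    # Three recursive multiplications instead of four
    x = karatsuba(a1, b1)
    z = karatsuba(a0, b0)
    y = karatsuba(a1 + a0, b1 + b0) - x - z
    return x * 10 ** (2 * half) + y * 10 ** half + z
```

For example, `karatsuba(99, 99)` returns `9801`.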
We can view this representation of integers similar to that of polynomials. Say we are given polynomials $A(x)$, $B(x)$, where $$A(x) = a_nx^n + a_{n-1}x^{n-1} + \ldots + a_0$$ $$B(x) = b_nx^n + b_{n-1}x^{n-1} + \ldots + b_0.$$ We have $C(x) = A(x) \cdot B(x)$ of degree $2n$ $$C(x) = \sum_{j = 0}^{2n}c_jx^j = A(x)B(x) = \sum_{j=0}^{2n} \left( \sum_{i + k = j}a_ib_k \right)x^j$$ however, we want to find the coefficients of $c_j = \sum_{i+k=j}a_ib_k$ without performing $(n+1)^2$ many multiplications necessary to get all products of the form $a_ib_k$.
A naive way to multiply two polynomials together would be to multiply each term of the first polynomial with each term of the second polynomial. The runtime of this would be $O(n^2)$, where $n$ is the degree (equivalently, the number of terms) of the polynomials.
Another method would be to convert both polynomials into their point-value form, multiply those values pointwise, and convert the result back into coefficient form. The fast Fourier transform (FFT) is an algorithm to calculate the Discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT), in $O(n \log n)$.
A polynomial $A(x)$ of degree $n$ is uniquely determined by its values at any $n + 1$ distinct input values $x_0, x_1, \ldots, x_n$: $$A(x) \leftrightarrow \langle(x_0, A(x_0)), (x_1, A(x_1)), \ldots, (x_n, A(x_n))\rangle$$
For $A(x) = a_nx^n + a_{n-1}x^{n-1} + \ldots + a_0$, these values can be obtained via a matrix multiplication:
$$ \begin{pmatrix} 1 & x_0 & x_0^2 & \ldots & x_0^n \\ 1 & x_1 & x_1^2 & \ldots & x_1^n \\ \vdots & \ldots & \ldots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \ldots & x_n^n \\ \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \\ \end{pmatrix} = \begin{pmatrix} A(x_0) \\ A(x_1) \\ \vdots \\ A(x_n) \\ \end{pmatrix} $$
Such a matrix is called the Vandermonde matrix.
It can be shown that if $x_i$ are all distinct, then this matrix is invertible. Thus, if all $x_i$ are all distinct, given any values $A(x_0), A(x_1), \ldots, A(x_n)$ the coefficients $a_0, a_1, \ldots, a_n$ of the polynomial $A(x)$ are uniquely determined:
$$ \begin{pmatrix} 1 & x_0 & x_0^2 & \ldots & x_0^n \\ 1 & x_1 & x_1^2 & \ldots & x_1^n \\ \vdots & \ldots & \ldots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \ldots & x_n^n \\ \end{pmatrix}^{-1} \begin{pmatrix} A(x_0) \\ A(x_1) \\ \vdots \\ A(x_n) \\ \end{pmatrix} = \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \\ \end{pmatrix} $$
If we fix the inputs $x_0, x_1, \ldots, x_n$ then converting between a representation of a polynomial $A(x)$ via its coefficients and a representation via its values at these points is done via the above two matrix multiplications, with matrices made of constants. Done naively, each such conversion costs $O(n^2)$ operations; choosing the evaluation points carefully (as the roots of unity below) lets the FFT perform it in $O(n \log n)$.
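As a small illustration of switching between the two representations (a sketch assuming NumPy is available; the polynomial and sample points are arbitrary):

```python
import numpy as np

# Coefficients of A(x) = 1 + 2x + 3x^2, in increasing order of degree
a = np.array([1.0, 2.0, 3.0])
# Three distinct sample points x_0, x_1, x_2
xs = np.array([0.0, 1.0, 2.0])
# Vandermonde matrix with rows (1, x_i, x_i^2)
V = np.vander(xs, increasing=True)
# Evaluation: values[i] = A(x_i)
values = V @ a
# Interpolation: solve V a = values to recover the coefficients
recovered = np.linalg.solve(V, values)
print(values)     # [ 1.  6. 17.]
print(recovered)  # [1. 2. 3.]
```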
For $\mathbf{a} = \langle a_0, a_1, \ldots, a_{n-1} \rangle$ a sequence of $n$ real or complex numbers, we can form the corresponding polynomial $P_A(x) = \sum_{j=0}^{n-1}a_jx^j$, and evaluate it at all complex roots of unity of order $n$, $$\forall \ 0 \leq k \leq n - 1, \quad P_A(\omega_n^k) = A_k = \sum_{j=0}^{n-1}a_j\omega_n^{jk}.$$
The DFT of a sequence $\mathbf{a}$ is the sequence $\mathbf{A} = \langle A_0, A_1, \ldots, A_{n-1} \rangle$ of these values, which has the same length as $\mathbf{a}$.
The IDFT of a sequence $\mathbf{A} = \langle A_0, A_1, \ldots, A_{n-1} \rangle$ is the sequence of values $\mathbf{a} = \langle a_0, a_1, \ldots, a_{n-1} \rangle = \left\langle \frac{P_{\mathbf{A}}(1)}{n}, \frac{P_{\mathbf{A}}(\omega_n^{-1})}{n}, \frac{P_{\mathbf{A}}(\omega_n^{-2})}{n}, \ldots, \frac{P_{\mathbf{A}}(\omega_n^{-(n-1)})}{n} \right\rangle$, where $P_{\mathbf{A}}(x) = \sum_{j=0}^{n-1}A_jx^j$.
We can show that IDFT(DFT($\mathbf{a}$)) = $\mathbf{a}$ and DFT(IDFT($\mathbf{A}$)) = $\mathbf{A}$.
Brute force computation of the DFT takes $\Theta(n^2)$, same for IDFT. The DFT of a sequence can be computed in $\Theta(n \lg n)$ using the FFT (as can be the IDFT).
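Below is a minimal recursive radix-2 FFT sketch in Python (it assumes the input length is a power of two and uses the sign convention $\omega_n = e^{i 2\pi / n}$ from above; the function names are illustrative):

```python
import cmath

# Recursive radix-2 FFT: computes the DFT of a, assuming len(a) is a power of two
def fft(a: list[complex]) -> list[complex]:
    n = len(a)
    if n == 1:
        return list(a)
    # Divide into even- and odd-indexed coefficients
    even = fft(a[0::2])
    odd = fft(a[1::2])
    # Combine: A_k = E_k + w_n^k * O_k and A_{k + n/2} = E_k - w_n^k * O_k
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

# IDFT via the FFT: conjugate, transform, conjugate again, divide by n
def ifft(A: list[complex]) -> list[complex]:
    n = len(A)
    conj = fft([x.conjugate() for x in A])
    return [x.conjugate() / n for x in conj]
```

For example, `ifft(fft([1, 2, 3, 4]))` returns the original sequence (as complex numbers with negligible imaginary parts).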
To prove the correctness of a divide and conquer algorithm, mathematical induction is used.
Say that we want to prove that an algorithm $A(n)$ is correct: we show that it is correct on the base case(s), then, assuming it is correct on all smaller inputs, we show that combining the solutions of the subproblems gives a correct answer for an input of size $n$.
The time complexity of a divide and conquer algorithm can be represented in the form $$T(n) = aT \left( \frac{n}{b} \right) + f(n)$$ where $a \geq 1$ is the number of subproblems, $b > 1$ is the factor by which the input size shrinks in each subproblem, and $f(n)$ is the cost of splitting the problem and combining the subproblem solutions.
Using this information, we can find the runtime of most divide and conquer algorithms through one of the following three cases (worked examples follow the list below), where $\epsilon > 0$ is a constant:
If $f(n) = O(n^{\log_b a - \epsilon})$, then $T(n) = \Theta(n^{\log_b a})$
The splitting and combining work is dwarfed by the cost of solving the subproblems
If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \cdot \log n)$
The splitting and combining work is comparable to the cost of solving the subproblems
If $f(n) = \Omega(n^{\log_b a + \epsilon})$, and the regularity condition $af(n/b) \leq cf(n)$ holds for some constant $c < 1$, then $T(n) = \Theta(f(n))$
The splitting and combining work dominates the cost of solving the subproblems
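For example, applying these cases to merge sort and to Karatsuba's algorithm: $$T(n) = 2T\left(\frac{n}{2}\right) + \Theta(n): \quad f(n) = \Theta(n^{\log_2 2}) \implies T(n) = \Theta(n \log n) \quad \text{(case 2)},$$ $$T(n) = 3T\left(\frac{n}{2}\right) + O(n): \quad f(n) = O(n^{\log_2 3 - \epsilon}) \implies T(n) = \Theta(n^{\log_2 3}) \quad \text{(case 1)}.$$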
Greedy algorithms are often used for optimisation problems (i.e. you want to maximise or minimise some quantity according to a set of constraints) and make the optimal choice at each step to find the overall optimal way to solve the entire problem.
Common examples of Greedy Algorithms include
Given a set of $n$ task requests with start and finish times,
$$R = \{(s_1, f_1), (s_2, f_2), ..., (s_n, f_n)\}$$
we want to find the maximum number of non-overlapping tasks that we can complete.
An example of a greedy strategy that can solve this is Earliest Finish First (EFF):
def eff(tasks: list[tuple[int, int]]) -> list[tuple[int, int]]:
    # Sort by earliest finish time
    tasks = sorted(tasks, key=lambda job: job[1])
    schedule = []
    prev_finish = float("-inf")
    for (start, finish) in tasks:
        # If the task doesn't conflict with the previously scheduled task
        if start >= prev_finish:
            # Add it to our schedule and update the finish time
            schedule.append((start, finish))
            prev_finish = finish
    return schedule
The runtime of this algorithm is $O(n \log n)$, as it is dominated by the time required to sort the jobs; the loop that follows runs in $O(n)$ time.
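A quick usage example (the task list is arbitrary):

```python
tasks = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]
print(eff(tasks))  # [(1, 4), (5, 7), (8, 11)]
```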
First, we prove the result is correct (i.e. there are no tasks conflicting).
This is true as we only add a new task to the schedule if it does not conflict with previous tasks.
Secondly, we prove that the result is optimal (i.e. it is indeed the maximum number of non conflicting tasks we can schedule).
Let $O = \{o_1, ..., o_k\}$ be the tasks of an optimal solution listed in increasing order of finish time, and $G = \{g_1,..., g_k \}$ be the tasks of the EFF solution.
Since $O$ is optimal, it must contain at least as many tasks as $G$. If $O \neq G$, there is a first index $j$ at which the two schedules differ, so that
$$ \begin{aligned} O &= \{o_1, ..., o_{j - 1}, o_j, ... \} \\ G &= \{o_1, ..., o_{j - 1}, g_j, ... \}. \end{aligned} $$
Since EFF is correct and selects the activity with the earliest finish time, $g_j$ does not conflict with any earlier activity, and it finishes no later than $o_j$.
Now consider a "greedier" optimal solution $O'$ where we replace $o_j$ with $g_j$ in $O$, such that
$$O' = \{ o_1, ..., o_{j - 1}, g_j, o_{j + 1}, ... \}.$$
Clearly, $O'$ is correct since $g_j$ finishes no later than $o_j$, so there are no conflicts. Moreover, this new schedule has the same number of activities as $O$, and so it is at least as good.
By repeating this process, we will eventually convert $O$ into $G$, without decreasing the number of tasks. Therefore $G$ is optimal.
Let $G = (V, E)$ be a connected undirected graph.
A Spanning Tree is a subgraph $T = (V, E_T)$ of $G$ such that $T$ contains every vertex of $G$ and is connected and acyclic (i.e. $T$ is a tree).
If $G$ is an edge-weighted graph, then a minimum spanning tree (MST) is a spanning tree of minimum total weight.
Let $T$ be the output of Kruskal's algorithm. $T$ is acyclic by construction, since an edge is only ever added if it does not create a cycle.
We show by contradiction that $T$ is connected: if $T$ were disconnected then, since $G$ is connected, there would be an edge of $G$ joining two different components of $T$. Such an edge cannot create a cycle when added, so Kruskal's algorithm would have added it, a contradiction.
Therefore $T$ is a spanning tree.
We consider the case where all weights are distinct.
Let $T$ be the output of Kruskal's algorithm.
$G$ has at least one MST, and (using the fact that all weights are distinct) it can be shown that any spanning tree $T' \neq T$ is not an MST; therefore $T$ is an MST.
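A minimal Python sketch of Kruskal's algorithm using a simple union-find (disjoint set) structure is given below; the edge-list representation, 0-indexed vertices and function names are assumptions for illustration:

```python
# A sketch of Kruskal's algorithm; edges are (weight, u, v) tuples over vertices 0..n-1
def kruskal(n: int, edges: list[tuple[int, int, int]]) -> list[tuple[int, int, int]]:
    parent = list(range(n))

    # Find the representative of a vertex's component (with path halving)
    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    # Consider edges in increasing order of weight
    for weight, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        # Add the edge only if it joins two different components (no cycle created)
        if ru != rv:
            parent[ru] = rv
            mst.append((weight, u, v))
    return mst
```

For example, `kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)])` returns the edges of weight 1, 2 and 4.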
Prim's algorithm is another greedy algorithm to find the MST of a graph; its differences from Kruskal's algorithm are listed below.
Kruskal's algorithm | Prim's algorithm |
---|---|
Tree built usually remains disconnected (a forest) | Tree built always remains connected |
Adds next cheapest edge | Adds next cheapest vertex |
Faster for sparse graphs | Faster for dense graphs |
Whilst greedy algorithms typically have a structure for correctness and optimality, the proof of runtime varies a lot depending on the data structures and algorithms used.
Note that greedy algorithms are typically used to maximise or minimise a quantity under a constraint.
We typically prove for correctness - that it satisfies the constraints of the problem (i.e. in task scheduling, we prove no two tasks overlap).
Secondly, we prove that the greedy method achieves an optimal solution (i.e. it does maximise or minimise the quantity). This can be done in multiple ways.
Greedy stays ahead: This works by showing that according to the measure of optimality, the greedy algorithm is at least as far ahead as the optimal solution during each iteration of the algorithm. It is typically done in 4 steps.
Exchange arguments: This works by showing you can iteratively transform any optimal solution into the result of the greedy algorithm without changing the cost of the optimal solution, thereby proving the greedy solution is optimal.
Dynamic programming is a method for finding optimal solutions to problems from optimal solutions to their subproblems. Because the sets of subproblems needed to solve larger problems overlap, each subproblem is calculated only once, and its solution is stored for future calls.
Common examples of Dynamic Programming include
Whilst simple, the problem of computing Fibonacci numbers showcases the concept of dynamic programming very well.
Create a function `fib` which, given $n$, finds the $n^{th}$ Fibonacci number. This can be defined with the following mathematical recurrence relation
$$ F_n = \begin{cases} 0 & \text{if } n = 0 \\ 1 & \text{if } n = 1 \\ F_{n - 1} + F_{n - 2} & \text{otherwise} \end{cases} $$
or with the following python code
# A function to calculate the nth fibonacci number
def fib(n: int) -> int:
if n == 0:
return 0
if n == 1:
return 1
return fib(n - 1) + fib(n - 2)
However, the runtime of this is $O(2^n)$, an exponential runtime, as each function call calls itself twice until it reaches a base case.
In top down dynamic programming, we use the process of memoisation (caching previous function call results) to reduce the runtime to $O(n)$, and space complexity of $O(n)$.
# Top down solution to find the nth fibonacci number
def fib(n: int) -> int:
    # Create memoisation table and add base cases
    memo = [-1] * (n + 1)
    memo[0] = 0
    if n >= 1:
        memo[1] = 1
    return dp(n, memo)

def dp(n: int, memo: list[int]) -> int:
    # If we have not calculated the value before
    if memo[n] == -1:
        memo[n] = dp(n - 1, memo) + dp(n - 2, memo)
    # Now we can simply return the stored value in O(1)
    return memo[n]
In bottom up DP, we write an iterative solution to compute the value of every subproblem to reduce the runtime to $O(n)$ and space complexity of $O(1)$.
# Bottom up solution to find the nth fibonacci number
def fib(n: int) -> int:
# Base Case
if n == 0:
return 0
if n == 1:
return 1
# Iterate up to the nth fibonacci number
a, b = 0, 1
for i in range(2, n + 1):
temp = a + b
a = b
b = temp
return b
Given a set of $n$ items with weights and values,
$$I = \{ (v_1, w_1), (v_2, w_2), ..., (v_n, w_n) \},$$
and a knapsack which has a weight capacity of $c$, choose a combination of items maximising the total value of the chosen items while ensuring their total weight is at most $c$.
We can define the following recurrence relation
$$ F(i, c) = \begin{cases} 0 & \text{if } i = 0 \text{ or } c = 0 \\ F(i - 1, c) & \text{if } w_i > c \\ \max(F(i - 1, c - w_i) + v_i, F(i - 1, c)) & \text{otherwise} \end{cases} $$
where $F(i, c)$ is the maximum value achievable using only the first $i$ items with remaining capacity $c$,
which has the base case that we return 0 if we have no items or no remaining capacity; otherwise we consider the current item, skip it if it does not fit, and take the maximum value we can get if we include it (gaining $v_i$ but using $w_i$ capacity) or exclude it.
The runtime of this is unfortunately $O(2^n)$ (since each subproblem makes two recursive calls and the recursion has a maximum depth of $n$), however since there are recalculated subproblems (i.e. when $i$ and $c$ are the same), we can use dynamic programming.
In calculating the $n^{th}$ fibonacci value, there was only one parameter changed in repeated subproblems, therefore we only needed a $n$ sized 1D memoisation table.
However, for the integer knapsack problem, the value of each subproblem is dependent on both $i$ and $c$ so therefore we need a 2D $n \times c$ memoisation table.
# Bottom up solution for the integer knapsack problem
def knapsack(items: list[tuple[int, int]], c: int) -> int:
    n = len(items)
    memo = [[0] * (c + 1) for _ in range(n + 1)]
    # Build table in bottom up manner
    for i in range(n + 1):
        for w in range(c + 1):
            # If no more items or remaining capacity is 0
            if i == 0 or w == 0:
                memo[i][w] = 0
                continue
            i_value, i_weight = items[i - 1]
            # If we can add the new item without going over capacity
            if i_weight <= w:
                include_v = memo[i - 1][w - i_weight] + i_value
                ignore_v = memo[i - 1][w]
                # Choose the option that gives the most value
                memo[i][w] = max(include_v, ignore_v)
            # If the item does not fit
            else:
                memo[i][w] = memo[i - 1][w]
    return memo[n][c]
Because `memo[i][j]` will always be the maximum value the bag can hold after considering whether or not to include the first `i` items when the bag has a maximum capacity of `j`, it follows that `memo[n][c]` will be the maximum value a bag of capacity `c` can hold after considering all `n` items.
The final time complexity is $O(nc)$ as there is a nested loop that loops $n \times c$ times, and each operation within the loop is $O(1)$ as they all only require $O(1)$ array access or arithmetic. Note that because the runtime is dependent on the numeric value of the input, and not just the length of the input size, the runtime of this algorithm is not polynomial, but pseudo-polynomial.
The space complexity is $O(nc)$ since there is a 2D $n \times c$ memoisation table to store the solutions of previous computations. It is important to keep in mind that despite the bottom up solution for fibonacci having a space complexity of $O(1)$, the space complexity here is the same as the top down solution as all previous computations need to be remembered. Similar to runtime, this also has a pseudo-polynomial space complexity.
Linear programming is an optimisation method to maximise or minimise a linear function (our objective) given linear constraints on variables.
This can be represented as a mathematical model with
Component | Mathematical Representation |
---|---|
Variables | $x_j \geq 0 \text{ for } 1 \leq j \leq n$ |
Objective | $\text{maximise or minimise } \sum^{n}_{j = 1} c_j x_j$ |
Constraints | $\sum^{n}_{j = 1} a_{ij} x_j R_i b_i, \text{ for } 1 \leq i \leq m \text{ with } R_i \in \{\leq, =, \geq \}$ |
A feasible solution is a variable assignment satisfying all constraints.
An optimal solution is a feasible solution which attains the best possible (maximum or minimum) value of the objective.
A linear programming formulation can also be represented by the canonical form.
maximise
$$\textbf{c}^\text{T}\textbf{x}$$
subject to the constraints
$$ \begin{align*} A\textbf{x} & \leq \textbf{b} \\ \textbf{x} & \geq 0. \end{align*} $$
where
$$ \textbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \textbf{c} = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}, \textbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}, A = \begin{pmatrix} a_{1, 1} & a_{1, 2} & \ldots & a_{1, n} \\ a_{2, 1} & a_{2, 2} & \ldots & a_{2, n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m, 1} & a_{m, 2} & \ldots & a_{m, n} \end{pmatrix}. $$
Solving an LP (Linear Programming) formulation is in P (has a polynomial runtime), whilst solving an ILP (Integer Linear Programming, similar to LP except that all the variables must be integers) formulation is NP-hard, with its decision version being NP-complete.
In practice, given an optimisation problem, a person formulates the linear programming problem, which is then given to an LP solver which uses algorithms such as the simplex algorithm.
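As a small sketch of this workflow (assuming SciPy is installed; the example LP is arbitrary, and since linprog minimises, the objective is negated to maximise):

```python
from scipy.optimize import linprog

# Maximise 3x1 + 2x2 subject to x1 + x2 <= 4, x1 + 3x2 <= 6 and x1, x2 >= 0.
# linprog minimises c^T x, so we pass the negated objective coefficients.
c = [-3, -2]
A_ub = [[1, 1], [1, 3]]
b_ub = [4, 6]
result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(result.x)     # optimal assignment, approximately [4. 0.]
print(-result.fun)  # optimal objective value, approximately 12.0
```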
Reductions are the conversion of one problem to another. An efficient reduction from A to B may be used to show that B is at least as difficult as A.
A problem $A(n)$ is in class P, denoted by $A \in \mathbf{P}$, if there is an algorithm which solves it in polynomial time with regard to the size of the input.
A problem $A(n)$ is in class NP, denoted by $A \in \mathbf{NP}$, if there is an algorithm which can verify if a solution is correct or not in polynomial time with regard to the size of the input.
A problem is NP-Hard if any problem in NP is reducible to it (informally, a problem is NP-Hard if it is at least as hard as the hardest problems in NP).
A problem is NP-Complete if it is in both NP, and NP-Hard (informally, a problem is NP-Complete if it is one of the hardest problems in NP, the "complete" meaning that it is able to simulate any problem in the same complexity class through reductions).
The SAT problem is: given a propositional formula in conjunctive normal form (CNF) $C_1 \land C_2 \land ... \land C_n$, where each clause $C_i$ is a disjunction of propositional variables or their negations, e.g.
$$ (P_1 \lor \lnot P_2 \lor P_3 \lor \lnot P_5) \land (P_2 \lor P_3 \lor \lnot P_5 \lor \lnot P_6) \land (\lnot P_3 \lor \lnot P_4 \lor P_5) $$
Is there some assignment of boolean values to $P_1, P_2, ..., P_6$ which makes the formula true?
If each clause involves exactly 3 variables, then this problem is 3SAT.
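As a small illustration of the gap between verifying and solving (a Python sketch; the encoding of literals as signed integers is an assumption), checking a given assignment is cheap, while the brute-force search below ranges over all $2^6$ assignments of the example formula:

```python
from itertools import product

# Clauses from the example formula: positive integers are variables, negatives are negations
clauses = [(1, -2, 3, -5), (2, 3, -5, -6), (-3, -4, 5)]

# Verifying one assignment is polynomial in the size of the formula
def satisfies(assignment: dict[int, bool], clauses) -> bool:
    return all(any(assignment[abs(l)] == (l > 0) for l in clause) for clause in clauses)

# Brute force over all assignments is exponential in the number of variables
variables = sorted({abs(l) for clause in clauses for l in clause})
for values in product([False, True], repeat=len(variables)):
    assignment = dict(zip(variables, values))
    if satisfies(assignment, clauses):
        print(assignment)
        break
```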
Theorem: every NP problem is polynomially reducible to the SAT problem.
This means that if the SAT problem is solvable in polynomial time, P = NP, as other NP problems can be reduced to SAT in polynomial time and then solved in polynomial time.
Karp's 21 problems are a set of problems which show that there is a many-one reduction from the SAT problem to other NP problems, therefore showing they are all NP-complete.
3GC, also known as 3 Graph Colouring, is a problem that asks: given an undirected graph $G = (V, E)$ and a set of colours $C = \{r, g, b\}$, is there a function $f: V \rightarrow C$ such that if $(v, w) \in E$ then $f(v) \neq f(w)$?
Using the notation that $v_i$ is a proposition that is true if vertex $v$ is given the colour $i$, we can use the following rules to complete the reduction:
- Every vertex receives at least one colour: for each vertex $v$, add the clause $(v_r \lor v_g \lor v_b)$.
- No vertex receives two colours: for each vertex $v$ and each pair of distinct colours $i, j$, add the clause $(\lnot v_i \lor \lnot v_j)$. If $v$ were given both colours, both propositions would be true inside the clause; since we take the negation of each proposition, both literals are evaluated as false, and their disjunction evaluates to false.
- Adjacent vertices receive different colours: for each edge $(v, w) \in E$ and each colour $i$, add the clause $(\lnot v_i \lor \lnot w_i)$, which similarly evaluates to false exactly when $v$ and $w$ are both given colour $i$.
The resulting formula is satisfiable if and only if $G$ is 3-colourable.
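A sketch of generating these clauses in Python (the representation of the proposition $v_i$ as a (negated, vertex, colour) triple is an assumption for illustration):

```python
from itertools import combinations

# Build the SAT clauses encoding a 3-colouring of the given graph.
# A literal (True, v, c) stands for the negated proposition "not v_c".
def three_gc_to_sat(vertices, edges, colours=("r", "g", "b")):
    clauses = []
    for v in vertices:
        # Each vertex receives at least one colour
        clauses.append([(False, v, c) for c in colours])
        # No vertex receives two different colours
        for c1, c2 in combinations(colours, 2):
            clauses.append([(True, v, c1), (True, v, c2)])
    # Adjacent vertices must receive different colours
    for (v, w) in edges:
        for c in colours:
            clauses.append([(True, v, c), (True, w, c)])
    return clauses
```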