Refreshers

Linear Algebra

Eigensystems

A square matrix \(X\in\mathbb{C}^{n\times n}\) has an eigenvalue \(\lambda\in\mathbb{C}\) with associated eigenvector \(\boldsymbol{v}\in\mathbb{C}^n\), \(\boldsymbol{v}\neq\boldsymbol{0}\), if \(X\boldsymbol{v}=\lambda\boldsymbol{v}\).

Each such matrix has \(n\) eigenvalues, possibly with duplication, and the set of all eigenvalues is called the spectrum. The largest eigenvalue is denoted \(\lambda_{\max}(X)\), and the largest eigenvalue magnitude, \(\max_i|\lambda_i|\), is called the spectral radius. The characteristic polynomial of a matrix \(X\) is given by \(p(x)=\det(X-xI)\). The eigenvalues of \(X\) are the roots of the characteristic polynomial, so \(p(x)=\prod_{i=1}^n(\lambda_i-x)\); setting \(x=0\) shows that the determinant of a matrix is the product of its eigenvalues: \(\det(X)=\prod_{i=1}^n\lambda_i\).
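
These facts are easy to check numerically with Matlab's built-in eig; a minimal sketch, where the matrix below is just an arbitrary example:

    X = [2 1 0; 1 3 1; 0 1 4];    % arbitrary example matrix
    lambda = eig(X);              % the spectrum: all n eigenvalues of X
    rho = max(abs(lambda));       % spectral radius: largest eigenvalue magnitude
    disp(det(X));                 % determinant computed directly
    disp(prod(lambda));           % product of the eigenvalues -- agrees up to rounding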

Diagonalisation

A matrix \(X\) is diagonalisable if it can be written in the form \(X=V\Lambda V^{-1}\), where \(\Lambda\) is a diagonal matrix. In this case the entries of \(\Lambda\) are the eigenvalues of \(X\) and the columns of \(V\) are the corresponding eigenvectors.

Every real symmetric matrix (i.e. \(X_{ij}=X_{ji}\in\mathbb{R}\)) is diagonalisable, and all its eigenvalues are real.
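
Matlab's eig returns both factors of the decomposition. A minimal sketch, again with an arbitrary symmetric example matrix:

    X = [2 1 0; 1 3 1; 0 1 4];     % real symmetric, so diagonalisable with real eigenvalues
    [V, Lambda] = eig(X);          % columns of V are eigenvectors, diag(Lambda) the eigenvalues
    disp(norm(X - V*Lambda/V));    % ~0, since X = V*Lambda*inv(V)
    disp(isreal(diag(Lambda)));    % 1: the eigenvalues are indeed real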

Inversion

The inverse of a square matrix \(X\) (written \(X^{-1}\)) is the matrix with the property that \(XX^{-1}=X^{-1}X=I\). A matrix \(X\) has an inverse if and only if \(\det(X)\neq 0\), or equivalently if zero is not an eigenvalue.

Methods for finding the inverse (a short Matlab check follows the list):

  • If the eigenvalues and eigenvectors are known then \(X=V\Lambda V^{-1}\Rightarrow X^{-1}=V\Lambda^{-1} V^{-1}\)

  • If \(a,b,c,d\) are numbers such that \(ad\neq bc\) then \[\left(\begin{array}{cc}a&b\\c&d\end{array}\right)^{-1}=\frac{1}{ad-bc}\left(\begin{array}{cc}d&-b\\-c&a\end{array}\right).\]

  • If \(A,B,C,D\) are matrices such that \(A\), \(D\), \(A-BD^{-1}C\) and \(D-CA^{-1}B\) are all invertible, then \[\left(\begin{array}{cc}A&B\\C&D\end{array}\right)^{-1}=\left(\begin{array}{cc}(A-BD^{-1}C)^{-1}&-(A-BD^{-1}C)^{-1}BD^{-1}\\-(D-CA^{-1}B)^{-1}CA^{-1}&(D-CA^{-1}B)^{-1}\end{array}\right).\] Note that \(B\) and \(C\) need not be square.

  • Gauss-Jordan algorithm
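
A minimal Matlab sketch comparing the first two methods against the built-in inv (the matrices are arbitrary examples):

    % 2x2 formula: the inverse of [a b; c d] is [d -b; -c a]/(ad - bc)
    M = [4 7; 2 6];
    Minv = [6 -7; -2 4] / (4*6 - 7*2);
    disp(norm(Minv - inv(M)));              % ~0

    % Eigendecomposition route: X = V*Lambda*inv(V)  =>  inv(X) = V*inv(Lambda)*inv(V)
    X = [2 1 0; 1 3 1; 0 1 4];
    [V, Lambda] = eig(X);
    Xinv = V * diag(1./diag(Lambda)) / V;   % inv(Lambda) is just the reciprocal diagonal
    disp(norm(Xinv - inv(X)));              % ~0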

Computational cost

The computational costs of matrix calculations are measured in terms of the number of elementary operations needed as a function of the size of the matrix. If an algorithm has cost \(C=\mathcal{O}(f(n))\), it means that there is a constant \(c>0\) such that, for all sufficiently large \(n\), the number of operations needed is at most \(cf(n)\).

With naive algorithms on \(n\)-vectors and real symmetric \(n\times n\) matrices we have the following costs (a rough timing sketch follows the list):

  • vector-scalar multiplication is \(\mathcal{O}(n)\)

  • vector-vector inner product (e.g. for norm calculation) is \(\mathcal{O}(n)\)

  • matrix-vector multiplication is \(\mathcal{O}(n^2)\)

  • matrix-matrix multiplication is \(\mathcal{O}(n^3)\)

  • matrix diagonalisation with the Jacobi algorithm is \(\mathcal{O}(n^3)\)

  • matrix inversion with Gauss-Jordan elimination is \(\mathcal{O}(n^3)\)
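
These scalings can be seen empirically. A rough Matlab sketch timing the matrix-vector and matrix-matrix products as \(n\) doubles (exact timings depend on the machine and on Matlab's optimised linear-algebra libraries, so treat the ratios as indicative only):

    for n = [500 1000 2000]
        A = randn(n); B = randn(n); x = randn(n, 1);
        tic; y = A*x; tmv = toc;    % matrix-vector: roughly 4x slower per doubling of n
        tic; C = A*B; tmm = toc;    % matrix-matrix: roughly 8x slower per doubling of n
        fprintf('n = %4d   A*x: %.4f s   A*B: %.4f s\n', n, tmv, tmm);
    end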

Probability

Probability space

A discrete probability space is determined by its sample space \(\Omega\), event space \(\mathcal{A}\subseteq\mathcal{P}(\Omega)\) and probability measure \(\mathbb{P}:\mathcal{A}\to[0,1]\). The elements of \(\Omega\) are called elementary events.

In this course I will only deal with very simple and intuitive probability spaces and will never need to be explicit about \(\Omega\), \(\mathcal{A}\) and \(\mathbb{P}\). For example, if I say “let \(u\) and \(v\) be distinct vertices chosen uniformly at random from \(V\)”, then we have \(\Omega=\{(u,v)\in V^2 | u\neq v\}\). In this example an elementary event is a particular choice of \(u\) and \(v\).

The word “uniform” means all elementary events are equally likely. In the above example each has probability \(1/(|V|(|V|-1))\), since \(|V|(|V|-1)\) is the number of ways to assign \(u\) and \(v\) to distinct elements of \(V\). A more general event could be something like “\(u\) and \(v\) are neighbours in the graph \(G=(V,E)\).” Since each of the \(|E|\) edges corresponds to two ordered pairs \((u,v)\), the probability of this event is \[\mathbb{P}(u\sim v)=\frac{2|E|}{|V|(|V|-1)}.\]
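
A Matlab sketch checking this formula by simulation; the graph here is just an arbitrary random example, built as a symmetric 0/1 adjacency matrix:

    n = 200;
    A = triu(rand(n) < 0.05, 1);    % random edges strictly above the diagonal
    A = A + A';                     % symmetrise; no self-loops
    m = nnz(A) / 2;                 % number of edges |E|

    trials = 1e5; hits = 0;
    for t = 1:trials
        uv = randperm(n, 2);        % two distinct vertices, chosen uniformly
        hits = hits + A(uv(1), uv(2));
    end
    fprintf('simulated: %.4f   formula 2|E|/(|V|(|V|-1)): %.4f\n', ...
            hits/trials, 2*m/(n*(n-1)));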

Conditional probability and independence

For events \(A\) and \(B\), the probability of \(A\) conditioned on \(B\) is \[\mathbb{P}(A|B)=\frac{\mathbb{P}(A \text{ and } B)}{\mathbb{P}(B)}.\] In words, this is the chance of \(A\) happening if \(B\) is certain.

Events \(A\) and \(B\) are independent if \(\mathbb{P}(A \text{ and } B)=\mathbb{P}(A)\mathbb{P}(B)\), or equivalently (when \(\mathbb{P}(B)>0\)) if \(\mathbb{P}(A | B)=\mathbb{P}(A)\); that is, knowledge of \(B\) tells us nothing about \(A\).
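
For a concrete example, roll a fair die and let \(A\) be the event “the outcome is even” and \(B\) the event “the outcome is at least 4”. Then \(\mathbb{P}(A)=\mathbb{P}(B)=\tfrac{1}{2}\) and \(\mathbb{P}(A\text{ and }B)=\mathbb{P}(\{4,6\})=\tfrac{1}{3}\), so \(\mathbb{P}(A|B)=\frac{1/3}{1/2}=\tfrac{2}{3}\). Since \(\tfrac{1}{3}\neq\mathbb{P}(A)\mathbb{P}(B)=\tfrac{1}{4}\), the events are not independent: knowing \(B\) makes \(A\) more likely.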

Random variables, distributions and expectations

A random variable is a function \(Y:\Omega\to\mathcal{Y}\) where \(\mathcal{Y}\) is some set. For example, if \(v\) is a vertex chosen uniformly at random from a graph \(G=(V,E)\), then its degree \(d(v)\) is a random variable in \(\mathbb{N}_0\).

A probability distribution on a discrete set \(\mathcal{Y}\) is a function \(p:\mathcal{Y}\to[0,1]\) such that \(\sum_{y\in\mathcal{Y}}p(y)=1\).

A random variable \(Y\) has a distribution on its codomain \(\mathcal{Y}\) that is inherited from the underlying probability measure \(\mathbb{P}\). Specifically, \[p(y)=\mathbb{P}(Y=y)=\mathbb{P}(\{\omega\in\Omega | Y(\omega)=y\}). \] For example, if \(v\) is a randomly chosen vertex in a graph, then \(d(v)\) has distribution \(p_d:=\frac{1}{|V|}\sum_u \delta_{d(u),d}\).

If \(\mathcal{Y}\subseteq \mathbb{R}\) then we define the expectation of random variable \(Y\) as \(\mathbb{E}(Y)=\sum_{y\in\mathcal{Y}}yp(y)\). Intuitively, this is the “average” or “mean” value of \(Y\).
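
Continuing the degree example, here is a Matlab sketch (again on an arbitrary random graph) that builds the distribution \(p_d\) and checks that \(\mathbb{E}(d(v))\) equals \(2|E|/|V|\):

    n = 200;
    A = triu(rand(n) < 0.05, 1);  A = A + A';    % arbitrary undirected graph
    deg = sum(A, 2);                              % degree of every vertex

    dvals = unique(deg);                          % degrees that actually occur
    p = arrayfun(@(d) mean(deg == d), dvals);     % p(d) = fraction of vertices with degree d

    disp(sum(dvals .* p));                        % E(d(v)) = sum over d of d*p(d)
    disp(nnz(A) / n);                             % = 2|E|/|V|, since nnz(A) counts each edge twice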

Matlab

Here is a guide to getting started with Matlab: Intro to Matlab

Once you are able to launch Matlab and run scripts, my advice is to experiment with the sample scripts appearing in problem sheets and examples.

If there is a function you want to learn more about, type “help” followed by the name of the function, and you will see a summary of its purpose and use cases.
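
For instance, to read about the eigendecomposition routine used in the sketches above:

    help eig    % prints a summary of eig and its calling forms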