matrix methods
notes from linear algebra
Vectors
Canonical Unit Vectors (standard basis vectors)
\( \mathbf{e}_i \) is the vector whose i-th entry is 1 and whose other entries are all 0. For example, in \( \mathbb{R}^3 \): \[ \mathbf{e}_1 = (1, 0, 0), \quad \mathbf{e}_2 = (0, 1, 0), \quad \mathbf{e}_3 = (0, 0, 1) \] Any vector \( \mathbf{v} \in \mathbb{R}^n \) can be expressed as: \[ \mathbf{v} = v_1 \mathbf{e}_1 + v_2 \mathbf{e}_2 + \cdots + v_n \mathbf{e}_n \]
Inner Products (dot products)
\[ a \cdot b = a^T b \]
Properties:
1. \( b^T a = a^T b \)
2. \( (\gamma a)^T b = \gamma (a^T b)\)
Scaling a and then taking the dot product with b gives the same result as taking the dot product of a and b and then scaling it.
3. \( (a + b)^T c = a^T c + b^T c \)
The dot product of the sum a + b with c equals the dot product of a with c plus the dot product of b with c.
Operations:
1. Picking out the i'th element: \( \quad e_{i}^T a = a_{i} \)
2. Sum of elements: \( \quad 1^T a = a_1 + a_2 + \cdots + a_n \)
3. Sum of squares: \( \quad a^T a = a_{1}^2 + a_{2}^2 + \cdots + a_{n}^2 \) \( \quad \quad a^T a = 0 \quad \text{iff} \quad a = 0 \)
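The properties and operations above can be checked numerically. The following is a minimal sketch using plain Python lists as vectors; the helper names `dot`, `unit_vector`, and `ones` and the sample vectors are illustrative assumptions, not part of the notes.

```python
def dot(a, b):
    """Inner product a^T b of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(ai * bi for ai, bi in zip(a, b))

def unit_vector(i, n):
    """Standard basis vector e_i in R^n (i is 1-based, as in the notes)."""
    return [1 if k == i else 0 for k in range(1, n + 1)]

def ones(n):
    """The all-ones vector of length n."""
    return [1] * n

# Sample vectors chosen only for illustration.
a = [2, -1, 4]
b = [3, 0, 5]
c = [1, 1, -2]
gamma = 3

# Property 1: symmetry.
symmetric = dot(a, b) == dot(b, a)

# Property 2: scaling a first, or scaling the result, gives the same value.
scaled = dot([gamma * ai for ai in a], b) == gamma * dot(a, b)

# Property 3: (a + b)^T c = a^T c + b^T c.
a_plus_b = [ai + bi for ai, bi in zip(a, b)]
additive = dot(a_plus_b, c) == dot(a, c) + dot(b, c)

# Operations: pick out an entry, sum the entries, sum of squares.
picked = dot(unit_vector(2, 3), a)   # second entry of a
total = dot(ones(3), a)              # a_1 + a_2 + a_3
sum_sq = dot(a, a)                   # a_1^2 + a_2^2 + a_3^2
```

Integer entries are used so that the equality checks are exact; with floating-point entries a tolerance would be needed.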
Linear Functions
A linear function satisfies the superposition property (homogeneity and additivity together):
1. Homogeneity: \( f(\alpha x) = \alpha f(x) \) for any scalar \( \alpha \)
2. Additivity: \( f(x + y) = f(x) + f(y) \)
Any linear function can be written as a dot product \( f(x) = a^T x \) for some fixed vector a, and conversely any function of this form is linear:
\[ f(c_1 x + c_2 y) = \] \[ a^T (c_1 x + c_2 y) = a^T (c_1 x) + a^T (c_2 y) = c_1(a^T x) + c_2(a^T y) \] \[ = c_1 f(x) + c_2 f(y) \]
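As a small numerical illustration of the derivation above, the sketch below checks superposition for one choice of a, x, y, \( c_1 \), \( c_2 \), and then recovers a from f alone by evaluating f on the standard basis vectors, since \( f(e_i) = a^T e_i = a_i \). The helper names `make_linear` and `combo` and the sample values are assumptions made for the example.

```python
def dot(a, b):
    """Inner product a^T b."""
    return sum(ai * bi for ai, bi in zip(a, b))

def make_linear(a):
    """Return the linear function f(x) = a^T x for a fixed vector a."""
    return lambda x: dot(a, x)

def combo(c1, x, c2, y):
    """Linear combination c1*x + c2*y, computed entry by entry."""
    return [c1 * xi + c2 * yi for xi, yi in zip(x, y)]

# Sample values chosen only for illustration.
a = [1, -2, 3]
f = make_linear(a)
x = [4, 0, -1]
y = [2, 5, 1]
c1, c2 = 2, -3

lhs = f(combo(c1, x, c2, y))      # f(c1 x + c2 y)
rhs = c1 * f(x) + c2 * f(y)       # c1 f(x) + c2 f(y)

# Recover a from f: evaluate f on each standard basis vector e_i.
n = len(a)
recovered = [f([1 if k == i else 0 for k in range(n)]) for i in range(n)]
```

Here `lhs` and `rhs` agree, and `recovered` reproduces the coefficient vector a.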
Affine Functions
An affine function is a linear function \( f(x) = a^T x \) plus a constant offset b. A linear function always passes through the origin (\( f(0) = 0 \)); an affine function need not: \[ f(x) = a^T x + b \]
Taylor Approximation
A first-order Taylor approximation is a linear (strictly speaking, affine) approximation of the function f near a point z. For x close to z, we can approximate f(x) using the gradient of f at z: \[ \hat{f}(x) = f(z) + \nabla f(z)^T (x - z) \]
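To make the approximation concrete, here is a minimal sketch for one hand-picked function, \( f(x) = x_1^2 + 3 x_2 \), whose gradient \( (2 x_1, 3) \) is written out by hand. The function, the point z, and the helper name `taylor` are assumptions chosen for the example.

```python
def f(x):
    """Example function f(x) = x1^2 + 3*x2 (an illustrative choice)."""
    return x[0] ** 2 + 3 * x[1]

def grad_f(x):
    """Gradient of the example function, derived by hand: (2*x1, 3)."""
    return [2 * x[0], 3]

def taylor(f, grad, z):
    """Return the first-order approximation f_hat(x) = f(z) + grad(z)^T (x - z)."""
    fz = f(z)
    gz = grad(z)
    def f_hat(x):
        return fz + sum(gi * (xi - zi) for gi, xi, zi in zip(gz, x, z))
    return f_hat

z = [1.0, 2.0]
f_hat = taylor(f, grad_f, z)

near = [1.1, 2.1]      # close to z: small approximation error
far = [3.0, 2.0]       # farther from z: larger approximation error

err_near = abs(f(near) - f_hat(near))
err_far = abs(f(far) - f_hat(far))
```

For this particular f the error equals \( (x_1 - z_1)^2 \), so it grows quadratically as x moves away from z, which is why the approximation is only trusted near z.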
Norm and Distance
\[ ||x|| = \sqrt{x_{1}^2 + x_{2}^2 + \cdots + x_{n}^2} = \sqrt{x^T x} \] \[ ||x||^2 = \textbf{sum of squares} = x_{1}^2 + x_{2}^2 + \cdots + x_{n}^2 = x^T x \]
Properties:
1. \( ||cx|| = |c| \, ||x|| \)
2. \( ||x + y|| \leq ||x|| + ||y|| \)
3. \( ||x|| \geq 0 \)
4. \( ||x|| = 0 \quad \) iff \( \quad x = 0 \)
Mean square value: \( \frac{||x||^2}{n} \)
Root Mean square value (RMS): \( \frac{||x||}{\sqrt{n}} \quad \) (i.e., typical value of \( |x_i| \))
Block (stacked) Vectors:
\[ ||(a,b,c)||^2 = a^T a + b^T b + c^T c = ||a||^2 + ||b||^2 + ||c||^2 \]
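The norm, RMS value, and block-vector identity can be sketched in a few lines. The helper names `norm` and `rms` and the sample vectors below are illustrative assumptions.

```python
import math

def norm(x):
    """Euclidean norm ||x|| = sqrt(x^T x)."""
    return math.sqrt(sum(xi * xi for xi in x))

def rms(x):
    """Root mean square value ||x|| / sqrt(n)."""
    return norm(x) / math.sqrt(len(x))

# Sample vectors chosen only for illustration.
x = [3, 4]
y = [1, -2]
c = -2

norm_x = norm(x)                                   # sqrt(9 + 16)
homogeneous = math.isclose(norm([c * xi for xi in x]), abs(c) * norm(x))
triangle = norm([xi + yi for xi, yi in zip(x, y)]) <= norm(x) + norm(y)

# RMS of a vector whose entries all have magnitude 1.
rms_pm1 = rms([1, -1, 1, -1])

# Block (stacked) vector: list concatenation plays the role of stacking.
a, b, d = [1, 2], [2], [0, 2]
stacked = a + b + d
block_lhs = norm(stacked) ** 2
block_rhs = norm(a) ** 2 + norm(b) ** 2 + norm(d) ** 2
```

The third block is named `d` here only because `c` is already used for the scalar.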
Chebyshev's Inequality:
- Bounds how many entries of a vector can be large
- Most entries of a vector can't be much bigger than its RMS value
Suppose k entries of the vector x satisfy \( |x_i| \geq a \) for some \( a > 0 \). Each such entry contributes at least \( a^2 \) to \( ||x||^2 \), so \( ||x||^2 \geq k a^2 \). Then:
- The number of entries with \( |x_i| \geq a \) is no more than \( ||x||^2 / a^2 \)
- The fraction of entries with \( |x_i| \geq a \, RMS(x) \) is \( \leq \frac{1}{a^2} \)
\[ \frac{||x||^2}{a^2} \geq k \quad \text{or} \quad \frac{k}{n} \leq \left(\frac{RMS(x)}{a}\right)^2 \]
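A quick numerical check of both forms of the bound, as a sketch: the vector, the threshold, and the helper name `count_at_least` are assumptions made for the example.

```python
import math

def rms(x):
    """Root mean square value of x."""
    return math.sqrt(sum(xi * xi for xi in x) / len(x))

def count_at_least(x, a):
    """Number of entries with |x_i| >= a."""
    return sum(1 for xi in x if abs(xi) >= a)

# Sample vector with one large entry, chosen only for illustration.
x = [1, -1, 1, 5, -1, 1, 0, 1]
n = len(x)
norm_sq = sum(xi * xi for xi in x)       # ||x||^2

# Count form: k <= ||x||^2 / a^2.
a = 3
k = count_at_least(x, a)
bound = norm_sq / a ** 2

# Fraction form: at most 1/m^2 of the entries can reach m times the RMS value.
m = 2
fraction = count_at_least(x, m * rms(x)) / n
fraction_bound = 1 / m ** 2
```

The bound is usually loose, as it is here; it guarantees an upper limit on the count and says nothing about how close the actual count comes to it.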
Distance:
Distance between a and b is: \( || a - b || \)
Triangle Inequality: the length of any side of a triangle is at most the sum of the lengths of the other two sides: \( ||a - c|| \leq ||a - b|| + ||b - c|| \)
Cauchy-Schwarz Inequality: \( |a \cdot b| \leq ||a|| \, ||b|| \)
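Both inequalities can be checked on concrete vectors. The sketch below uses hypothetical helper names `dist` and `norm` and sample vectors picked for the example; it also shows that Cauchy-Schwarz holds with equality when one vector is a scalar multiple of the other.

```python
import math

def dot(a, b):
    """Inner product a^T b."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(x):
    """Euclidean norm ||x||."""
    return math.sqrt(dot(x, x))

def dist(a, b):
    """Distance ||a - b|| between two vectors."""
    return norm([ai - bi for ai, bi in zip(a, b)])

# Sample vectors chosen only for illustration.
a = [1, 2, 3]
b = [4, 0, -1]
c = [0, 1, 1]

# Triangle inequality on the triangle with vertices a, b, c.
triangle_ok = dist(a, c) <= dist(a, b) + dist(b, c)

# Cauchy-Schwarz for a generic pair.
cs_ok = abs(dot(a, b)) <= norm(a) * norm(b)

# Equality case: b is a scalar multiple of a.
twice_a = [2 * ai for ai in a]
cs_equal = math.isclose(abs(dot(a, twice_a)), norm(a) * norm(twice_a))
```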
Average: \( avg(x) = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1^T x}{n} \)
Demeaned vector: \( \tilde{x} = x - avg(x)1 \)
Standard deviation: \( std(x) = rms(\tilde{x}) = \frac{||x - avg(x)1||}{\sqrt{n}} \)
\( rms(x)^2 = avg(x)^2 + std(x)^2 \)
Standardized vector \( z = \frac{\tilde{x}}{std(x)}\)
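The definitions above translate directly into code. This is a minimal sketch with illustrative helper names and a sample vector chosen so the numbers come out round. Note that `std` here divides by n, matching the definition in these notes, rather than by n - 1 as some statistics libraries do by default.

```python
import math

def avg(x):
    """Average 1^T x / n."""
    return sum(x) / len(x)

def rms(x):
    """Root mean square value ||x|| / sqrt(n)."""
    return math.sqrt(sum(xi * xi for xi in x) / len(x))

def demean(x):
    """Demeaned vector x - avg(x) 1."""
    m = avg(x)
    return [xi - m for xi in x]

def std(x):
    """Standard deviation rms(x - avg(x) 1), dividing by n."""
    return rms(demean(x))

def standardize(x):
    """Standardized vector z = demeaned x divided by std(x)."""
    s = std(x)
    if s == 0:
        raise ValueError("constant vector has zero standard deviation")
    return [d / s for d in demean(x)]

# Sample vector chosen only for illustration.
x = [2, 4, 4, 4, 5, 5, 7, 9]

mean_x = avg(x)
std_x = std(x)
identity_holds = math.isclose(rms(x) ** 2, avg(x) ** 2 + std(x) ** 2)

z = standardize(x)   # has average 0 and standard deviation 1
```

In the finance reading of these quantities, `avg` of a return series would be the average return and `std` its volatility.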
Example in Finance:
x = time series of returns
avg(x) = average return
std(x) = volatility
Risk-Reward Plot: