Things covered: (only an overview here, see slides for details)
- motivation: we need a framework to handle a continuous space of outcomes, like "picking a random point on the unit sphere"
- expectations and law of large numbers
- densities $\phi$ and distribution functions $F$, where $\phi(t) = \frac{d}{dt}F(t)$
- probability spaces $(\Omega, \Sigma, \Bbb P)$
- union bound: $P\left(\bigcup_\ell B_\ell\right) \leq \sum_\ell P\left(B_\ell\right)$
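A quick Monte Carlo sanity check of the union bound; a minimal sketch where the events $B_\ell$ are arbitrary overlapping intervals (my choice for illustration, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
omega = rng.uniform(0, 1, size=100_000)  # sample outcomes from ([0,1], Lebesgue)

# Three overlapping events B_l, chosen arbitrarily for illustration.
events = [omega < 0.3, (omega > 0.2) & (omega < 0.5), omega > 0.4]

p_union = np.mean(np.any(events, axis=0))  # P(union of the B_l)
sum_p = sum(np.mean(b) for b in events)    # sum_l P(B_l)
print(p_union, "<=", sum_p)                # bound holds; strict, since the events overlap
```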
- moments $E(X^p)$ and absolute moments $E(|X|^p)$
- $L^p(\Omega, P)$ space
- "triangle inequality" (Minkowski) for $p \geq 1$: $\left(E(|X+Y|^p)\right)^{\frac{1}{p}} \leq \left(E(|X|^p)\right)^{\frac{1}{p}} + \left(E(|Y|^p)\right)^{\frac{1}{p}}$
- Hölder inequality: $|E(XY)| \leq \left(E(|X|^p)\right)^{\frac{1}{p}} \left(E(|Y|^q)\right)^{\frac{1}{q}}$ if $\frac{1}{p} + \frac{1}{q} = 1$
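A minimal Monte Carlo check of the Hölder inequality; the particular $X$, $Y$ (dependent, to make the bound non-trivial) and the exponents are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(1_000_000)
y = x**2 + rng.standard_normal(1_000_000)   # deliberately dependent on x

p, q = 3.0, 1.5                             # conjugate exponents: 1/p + 1/q = 1
lhs = abs(np.mean(x * y))
rhs = np.mean(np.abs(x)**p)**(1/p) * np.mean(np.abs(y)**q)**(1/q)
print(lhs, "<=", rhs)                       # Hoelder holds exactly for the empirical moments
```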
- Cavalieri's formula: for $p > 0$, $$E(|X|^p) = p \int_0^\infty P(|X| \geq t)\, t^{p-1}\,dt$$
  - corollary: $E(X) = \int_0^\infty P(X \geq t)\,dt - \int_0^\infty P(X \leq -t)\,dt$
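Cavalieri's formula is easy to verify numerically; a sketch for a standard exponential variable, where $P(X \geq t) = e^{-t}$ and the exact moment is $E(X^p) = \Gamma(p+1)$:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

p = 2.5
# E(|X|^p) = p * int_0^inf P(|X| >= t) t^(p-1) dt, with P(X >= t) = exp(-t)
integral, _ = quad(lambda t: np.exp(-t) * t**(p - 1), 0, np.inf)
print(p * integral, "vs", gamma(p + 1))   # both are Gamma(p+1) = E(X^p)
```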
- Lebesgue dominated convergence theorem: if all $X_n$ are dominated in absolute value by some $Y$ with $E(|Y|) < \infty$ almost surely, and $X_n \to X$ almost surely, then $E(X_n) \to E(X)$
- Fubini's theorem: if $f: A \times B \to C$ is measurable and $\int_{A\times B} |f(x,y)| \,d(\nu \otimes \mu)(x,y) < \infty$, then $$\int_A \left(\int_B f(x,y) \,d\mu(y) \right)\,d\nu(x) = \int_B \left( \int_A f(x,y) \, d\nu(x)\right) \,d\mu(y)$$
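A small numerical illustration of Fubini's theorem; a sketch taking both measures to be Lebesgue on $[0,1]$ and an arbitrary (absolutely integrable) $f$:

```python
import numpy as np
from scipy.integrate import quad

def f(x, y):
    return np.exp(-x * y) * np.sin(x + y)   # arbitrary integrable choice

# Integrate over y first, then over x ...
order_xy = quad(lambda x: quad(lambda y: f(x, y), 0, 1)[0], 0, 1)[0]
# ... and in the opposite order.
order_yx = quad(lambda y: quad(lambda x: f(x, y), 0, 1)[0], 0, 1)[0]
print(order_xy, order_yx)                   # agree up to quadrature tolerance
```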
- tail bounds
  - tail $t \mapsto P(|X| \geq t)$
  - Markov inequality: $P(|X| \geq t) \leq \frac{E(|X|)}{t}$ for all $t > 0$
  - as a consequence, for $p, t > 0$: $P(|X| \geq t) \leq t^{-p} E(|X|^p)$
    - for $p=2$, this is called the Chebyshev inequality
  - also, for $\theta > 0$: $P(X \geq t) \leq \exp(-\theta t)\, E(\exp(\theta X))$
    - the function $\theta \mapsto E(\exp(\theta X))$ is the Laplace transform, or moment generating function, of $X$
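How the three bounds compare to the true tail; a sketch for a standard exponential $X$, where $E(X) = 1$, $E(X^2) = 2$, and $E(e^{\theta X}) = \frac{1}{1-\theta}$ for $\theta < 1$:

```python
import numpy as np

t = np.array([2.0, 4.0, 8.0])
true_tail = np.exp(-t)                        # P(X >= t) for Exp(1)

markov = 1.0 / t                              # E(X)/t, with E(X) = 1
moment_p2 = 2.0 / t**2                        # t^{-2} E(X^2), with E(X^2) = 2
theta = 0.5                                   # any fixed theta in (0, 1)
exp_bound = np.exp(-theta * t) / (1 - theta)  # exp(-theta t) E(exp(theta X))

for row in zip(t, true_tail, markov, moment_p2, exp_bound):
    print("t=%g: true=%.5f Markov=%.5f p=2: %.5f exp: %.5f" % row)
```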
- normal distribution $\mathcal N(\mu, \sigma)$
- random vectors
  - joint distribution function $F(t_1, \dots, t_n) = P(X_i \leq t_i \;\forall i)$
  - joint density $\phi$ if $P(X \in D) = \int_D \phi(t_1,\dots,t_n)\,dt_1 \cdots dt_n$ for all measurable $D$
  - complex random vectors
- independence
  - $E(\cdot)$ and multiplication commute for independent random variables
  - joint density factorizes for independent random variables
  - applying a measurable function maintains independence
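A one-line Monte Carlo check that expectation and multiplication commute for independent variables; the two distributions here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(0, 1, 1_000_000)             # X and Y sampled independently
y = rng.standard_normal(1_000_000) + 2.0

print(np.mean(x * y), "vs", np.mean(x) * np.mean(y))   # both ≈ 0.5 * 2 = 1
```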
%% #Lecture 12, 21.06. [[Foundations Section 3.1 part I printout.pdf]], [[Foundations Section 3.1 part II printout.pdf]] %%
- for independent RVs, $\phi_{X+Y}(t) = (\phi_X \ast \phi_Y)(t) = \int_{-\infty}^\infty \phi_X(u)\, \phi_Y(t-u)\,du$
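The convolution formula, illustrated numerically; a sketch for two independent Uniform(0,1) variables, whose sum has the triangular density on $[0,2]$ peaking at $1$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000
s = rng.uniform(0, 1, n) + rng.uniform(0, 1, n)   # samples of X + Y

# Convolution of the two uniform densities, evaluated at t = 1 on a grid.
du = 1e-3
u = np.arange(0.0, 1.0, du)                       # support of phi_X
t = 1.0
phi_conv = np.sum(((t - u) >= 0) & ((t - u) <= 1)) * du   # = 1, the triangle peak

hist, _ = np.histogram(s, bins=100, range=(0, 2), density=True)
print(phi_conv, "vs empirical", hist[49])         # density of X + Y near t = 1
```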
- conditional expectations
  - keep some RVs $X_1, \dots, X_k$ fixed
  - pick the smallest $\sigma$-algebra $\mathcal F$ w.r.t. which $X_1,\dots,X_k$ are measurable
  - then: for any RV $Y$ with $E(|Y|) < \infty$, there is an (a.s. unique) $Z$ measurable w.r.t. $\mathcal F$ s.t. $E(Y \mathbf{1}_A) = E(Z \mathbf{1}_A)$ for all $A \in \mathcal F$. We call $Z =: E(Y \mid X_1, \dots, X_k)$ the conditional expectation.
  - relationship: $P(A \mid X_1, \dots, X_k) = E(\mathbf{1}_A \mid X_1, \dots, X_k)$ allows computing conditional probabilities in the continuous setting
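A hands-on way to see $E(Y \mid X)$: average $Y$ over a thin slab of $X$-values; a sketch assuming the toy model $Y = X^2 + \text{noise}$, so that $E(Y \mid X) = X^2$ by construction:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(1_000_000)
y = x**2 + rng.standard_normal(1_000_000)   # E(Y | X) = X^2 by construction

# Approximate E(Y | X = 1) by averaging y over a thin slab around x = 1.
mask = np.abs(x - 1.0) < 0.01
print(y[mask].mean(), "vs", 1.0)            # ≈ 1 = 1^2
```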
- Fubini for expectations: $E_X(E_{Y}(f(X, Y))) = E_Y(E_X(f(X, Y))) = E_{X,Y}(f(X,Y))$
  - the slides write this more formally by introducing $f_1(x) = E(f(x, Y))$ and $f_2(y) = E(f(X, y))$; then $E(f_1(X)) = E(f_2(Y)) = E(f(X, Y))$
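A Monte Carlo version of the $f_1$, $f_2$ formulation; a sketch assuming $X$, $Y$ independent (here Uniform(0,1) and standard normal) and an arbitrary $f$:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000
x = rng.uniform(0, 1, n)                    # X and Y independent
y = rng.standard_normal(n)

def f(xv, yv):
    return np.cos(xv) * yv**2               # arbitrary choice of f

direct = np.mean(f(x, y))                   # E(f(X, Y)) over the joint sample
f1 = np.cos(x) * np.mean(y**2)              # f1(x) = E(f(x, Y)) = cos(x) E(Y^2)
print(direct, "vs", np.mean(f1))            # E(f1(X)) ≈ E(f(X, Y))
```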
- Gaussian vectors
  - Gaussian distribution is rotation invariant
  - sum of independent Gaussian variables is Gaussian
  - multidimensional Gaussian density $$\psi(x) = \frac{1}{(2\pi)^\frac{n}{2} \sqrt{\det(\Sigma)}} \exp\left(-\tfrac{1}{2} \langle x-\mu, \Sigma^{-1}(x-\mu)\rangle\right)$$
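The density formula, checked against scipy's reference implementation; the mean, covariance, and evaluation point are arbitrary:

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([0.5, -1.0])

# The formula above, written out directly.
n = len(mu)
diff = x - mu
psi = np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)) / (
    (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma)))

print(psi, "vs", multivariate_normal(mu, Sigma).pdf(x))   # identical
```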
- Jensen's inequality: for $f: \Bbb R^n \to \Bbb R$ convex, $f(E(X)) \leq E(f(X))$
  - for concave $f$, the inequality is reversed
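A quick Monte Carlo illustration of Jensen's inequality with the convex function $\exp$; for a standard normal $X$, $\exp(E(X)) = 1$ while $E(\exp(X)) = e^{1/2}$:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.standard_normal(1_000_000)

print(np.exp(x.mean()), "<=", np.exp(x).mean())   # ≈ 1 <= ≈ exp(1/2) ≈ 1.6487
```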