3 Probability Basics

Things covered (only an overview here; see the slides for details):

3.1.I Essentials (Part I)

  • motivation: we need a framework to handle a continuous space of outcomes, like "picking a random point on the unit sphere"
  • expectations and law of large numbers
  • densities $\phi$ and distribution functions $F$, where $\phi(t) = \frac{d}{dt}F(t)$
  • probability spaces $(\Omega, \Sigma, \Bbb P)$
  • union bound: $P\left(\bigcup_\ell B_\ell\right) \leq \sum_\ell P\left(B_\ell\right)$
  • moments $E(X^p)$ and absolute moments $E(|X|^p)$
    • $L^p(\Omega, P)$ space
    • "triangle inequality" $\left(E(|X+Y|^p)\right)^{\frac{1}{p}} \leq \left(E(|X|^p)\right)^{\frac{1}{p}} + \left(E(|Y|^p)\right)^{\frac{1}{p}}$
    • Hölder inequality: $|E(XY)| \leq \left(E(|X|^p)\right)^{\frac{1}{p}} \left(E(|Y|^q)\right)^{\frac{1}{q}}$ if $\frac{1}{p} + \frac{1}{q} = 1$
    • Cavalieri's formula (checked numerically in the first sketch after this list): for $p > 0$, $$E(|X|^p) = p \int_0^\infty P(|X| \geq t)\, t^{p-1}\;dt$$
      • corollary: $E(X) = \int_0^\infty P(X \geq t)\;dt - \int_0^\infty P(X \leq -t)\;dt$
  • Lebesgue dominated convergence theorem: If $|X_n| \leq Y$ almost surely for all $n$, where $E(|Y|) < \infty$, and $X_n \to X$ almost surely, then $E(X_n) \to E(X)$
  • Fubini's Theorem: If $f: A \times B \to C$ is measurable and $\int_{A\times B} |f(x,y)| \;d(\nu \otimes \mu)(x,y) < \infty$, then $$\int_A \left(\int_B f(x,y) \;d\mu(y) \right)\;d\nu(x) = \int_B \left( \int_A f(x,y) \;d\nu(x)\right) \;d\mu(y)$$
  • Tail bounds (compared with an empirical tail in the second sketch after this list)
    • tail $t \mapsto P(|X| \geq t)$
    • Markov inequality $P(|X| \geq t) \leq \frac{E(|X|)}{t}$ for all $t > 0$
    • as a consequence, for $p, t > 0$, $P(|X| \geq t) \leq t^{-p} E(|X|^p)$
      • for $p=2$, this is called the Chebyshev inequality (usually stated for the centered variable: $P(|X - E(X)| \geq t) \leq \frac{\operatorname{Var}(X)}{t^2}$)
    • also for $\theta > 0$, $P(X \geq t) \leq \exp(-\theta t) E(\exp(\theta X))$
      • the function $\theta \mapsto E(\exp(\theta X))$ is the Laplace transform, or moment generating function of $X$
  • Normal distribution $\mathcal N(\mu, \sigma)$
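
A minimal Monte Carlo sketch of Cavalieri's formula (my own illustration, not from the slides; the choices $X \sim \mathcal N(0,1)$ and $p = 3$ are arbitrary): it estimates $E(|X|^p)$ once directly and once via $p \int_0^\infty P(|X| \geq t)\, t^{p-1}\;dt$, with the tail taken from the same sample and the integral approximated on a grid.

```python
import numpy as np

# Check Cavalieri's formula E|X|^p = p * int_0^inf P(|X| >= t) t^(p-1) dt by Monte Carlo.
# Illustration only: X ~ N(0,1) and p = 3 are arbitrary choices.
rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)
p = 3.0

direct = np.mean(np.abs(x) ** p)                     # E|X|^p estimated directly

abs_sorted = np.sort(np.abs(x))
t = np.linspace(0.0, 10.0, 2001)
dt = t[1] - t[0]
# empirical tail P(|X| >= t) via a sorted-array lookup
tail = 1.0 - np.searchsorted(abs_sorted, t, side="left") / abs_sorted.size
via_tail = p * np.sum(tail * t ** (p - 1)) * dt      # Riemann sum of p * tail(t) * t^(p-1)

print(f"direct estimate   : {direct:.4f}")
print(f"via tail integral : {via_tail:.4f}")
# Both should be close to E|N(0,1)|^3 = 2 * sqrt(2/pi) ≈ 1.596.
```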
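
A companion sketch for the tail bounds (again only an illustration; $X \sim \operatorname{Exp}(1)$ and the values of $t$ are my own choices): it compares the empirical tail $P(X \geq t)$ with the Markov bound, the $p = 2$ moment bound, and the Laplace-transform bound with $\theta = 1 - 1/t$, which is the optimal choice for $\operatorname{Exp}(1)$.

```python
import numpy as np

# Compare the empirical tail P(X >= t) of X ~ Exp(1) against three bounds from the notes.
# Illustration only; the distribution and the grid of t values are arbitrary choices.
rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=500_000)

for t in (2.0, 4.0, 6.0):
    empirical = np.mean(x >= t)                    # empirical tail
    markov = np.mean(x) / t                        # Markov bound   E(X)/t
    moment2 = np.mean(x ** 2) / t ** 2             # p = 2 bound    E(X^2)/t^2
    theta = 1.0 - 1.0 / t                          # optimal theta for Exp(1)
    laplace = np.exp(-theta * t) * np.mean(np.exp(theta * x))   # exp(-theta t) E(exp(theta X))
    print(f"t={t:.0f}  empirical={empirical:.4f}  Markov={markov:.4f}  "
          f"p=2: {moment2:.4f}  Laplace={laplace:.4f}")
# The true tail is exp(-t); only the Laplace-transform bound decays exponentially in t.
```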

3.1.II Essentials (Part II)

  • random vectors
    • joint distribution function $F(t_1, \dots, t_n) = P(X_i \leq t_i \;\forall i)$
    • joint density if $P(X \in D) = \int_D \phi(t_1,\dots,t_n)\;dt_1 \cdots dt_n$ for all measurable $D$
    • complex random vectors
  • independence
    • $E(\cdot)$ and multiplication commute for independent random variables: $E(XY) = E(X)\,E(Y)$ (see the Monte Carlo check after this list)
    • joint density factorizes for independent random variables
    • applying measurable functions to independent random variables preserves their independence
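
A quick Monte Carlo check of the first two bullets above (illustration only; $X \sim \mathcal N(0,1)$, $Y \sim \operatorname{Uniform}(0,1)$ and the functions $f(x) = x^2$, $g(y) = \cos y$ are arbitrary choices): for independent $X, Y$ the product rule $E(XY) = E(X)\,E(Y)$ holds, and since measurable functions preserve independence it also holds for $f(X)$ and $g(Y)$.

```python
import numpy as np

# Monte Carlo check that expectation and multiplication commute for independent RVs,
# and that this survives applying (measurable) functions to each variable.
# Illustration only: the distributions and the functions f, g are arbitrary choices.
rng = np.random.default_rng(2)
x = rng.standard_normal(1_000_000)        # X ~ N(0,1)
y = rng.uniform(size=1_000_000)           # Y ~ Uniform(0,1), independent of X

print(np.mean(x * y), np.mean(x) * np.mean(y))        # both ≈ 0 * 0.5 = 0

f, g = x ** 2, np.cos(y)                  # f(X), g(Y) are again independent
print(np.mean(f * g), np.mean(f) * np.mean(g))        # both ≈ 1 * sin(1) ≈ 0.841
```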

%% #Lecture 12, 21.06. [[Foundations Section 3.1 part I printout.pdf]], [[Foundations Section 3.1 part II printout.pdf]] %%

  • for independent RVs, the density of the sum is the convolution of the densities: $\phi_{X+Y}(t) = (\phi_X \ast \phi_Y)(t) = \int_{-\infty}^\infty \phi_X(u)\, \phi_Y(t-u)\;du$ (checked numerically in the first sketch at the end of this section)
  • conditional expectations
    • keep some RVs $X_1, \dots, X_k$ fixed
    • pick the smallest $\sigma$-algebra $\mathcal F$ w.r.t. which $X_1,\dots,X_k$ are measurable
    • then: for any integrable RV $Y$, there is an (almost surely) unique $Z$ measurable w.r.t. $\mathcal F$ such that $E(Y \mathbf{1}_A) = E(Z \mathbf{1}_A)$ for all $A \in \mathcal F$. We call $Z =: E(Y \mid X_1, \dots, X_k)$ the conditional expectation.
    • relationship: $P(A \mid X_1, \dots, X_k) = E(\mathbf{1}_A \mid X_1, \dots, X_k)$ lets us compute conditional probabilities in the continuous setting
  • Fubini for expectations
    • $E_X(E_{Y}(f(X, Y))) = E_Y(E_X(f(X, Y))) = E_{X,Y}(f(X,Y))$
    • the slides write this more formally by introducing $f_1(x) = E(f(x, Y))$ and $f_2(y) = E(f(X, y))$; then $E(f_1(X)) = E(f_2(Y)) = E(f(X, Y))$
  • Gaussian vectors (see the sampling sketch at the end of this section)
    • the standard Gaussian distribution is rotation invariant
    • sums of independent Gaussian variables are Gaussian
    • multidimensional Gaussian density $$\psi(x) = \frac{1}{(2\pi)^\frac{n}{2} \sqrt{\det(\Sigma)}} \exp\left(-\tfrac{1}{2} \langle(x-\mu), \Sigma^{-1}(x-\mu)\rangle\right)$$
  • Jensen's inequality: for $f: \Bbb R^n \to \Bbb R$ convex, $f(E(X)) \leq E(f(X))$
    • for concave $f$, the inequality is reversed
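
A numerical check of the convolution formula for $\phi_{X+Y}$ (illustration only; $X, Y \sim \operatorname{Uniform}(0,1)$ is an arbitrary choice, for which the convolution is the triangular density on $[0,2]$): the discretized convolution of the two densities is compared against a Monte Carlo histogram of $X + Y$.

```python
import numpy as np

# phi_{X+Y} = phi_X * phi_Y for independent X, Y, checked numerically.
# Illustration with X, Y ~ Uniform(0,1); the convolution is the triangular density on [0, 2].
dt = 0.001
t = np.arange(0.0, 1.0, dt)                 # grid on the support of Uniform(0,1)
phi = np.ones_like(t)                       # density of Uniform(0,1)

conv = np.convolve(phi, phi) * dt           # discrete approximation of the convolution integral
s = np.arange(conv.size) * dt               # grid for the sum X + Y on [0, 2)

# Monte Carlo histogram of X + Y for comparison
rng = np.random.default_rng(3)
samples = rng.uniform(size=500_000) + rng.uniform(size=500_000)
hist, edges = np.histogram(samples, bins=40, range=(0.0, 2.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

for c, h in zip(centers[::8], hist[::8]):
    # the triangular density at c is 1 - |c - 1|; both columns should be close to it
    print(f"s={c:.2f}  histogram={h:.3f}  convolution={np.interp(c, s, conv):.3f}")
```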
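
A sampling sketch for Gaussian vectors (illustration only; the dimension $n = 3$ and the particular $\mu$, $\Sigma$ are arbitrary choices): rotating standard Gaussian samples by an orthogonal matrix leaves their empirical covariance at the identity, and an affine map $x = \mu + Lz$ with $\Sigma = LL^\top$ produces samples whose empirical mean and covariance match the parameters $\mu$ and $\Sigma$ appearing in the density $\psi$ above.

```python
import numpy as np

# Rotation invariance of the standard Gaussian and sampling from N(mu, Sigma) via a
# Cholesky factor. Illustration only: n = 3, mu and Sigma below are arbitrary choices.
rng = np.random.default_rng(4)
n, N = 3, 200_000
z = rng.standard_normal((N, n))                   # rows ~ N(0, I_n)

q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # a random orthogonal matrix
print(np.cov(z @ q.T, rowvar=False).round(2))     # rotated samples: covariance still ~ I_n

mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
L = np.linalg.cholesky(sigma)                     # Sigma = L L^T
x = mu + z @ L.T                                  # rows ~ N(mu, Sigma)
print(x.mean(axis=0).round(2))                    # ~ mu
print(np.cov(x, rowvar=False).round(2))           # ~ Sigma
```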