
Commit 4080fcf

[wald_friedman] Fix minor typos across two wald_friedman lectures (#509)
1 parent 3a05315 commit 4080fcf


3 files changed: +33 −37 lines changed


lectures/_static/quant-econ.bib

Lines changed: 1 addition & 1 deletion
@@ -1151,7 +1151,7 @@ @book{Kreps88
   address = {Boulder, Colorado}
 }
 
-@book{Bertekas75,
+@book{Bertsekas75,
   author = {Dmitri Bertsekas},
   title = {Dynamic Programming and Stochastic Control},
   year = {1975},

lectures/wald_friedman.md

Lines changed: 22 additions & 24 deletions
@@ -4,7 +4,7 @@ jupytext:
     extension: .md
     format_name: myst
     format_version: 0.13
-    jupytext_version: 1.17.2
+    jupytext_version: 1.17.1
 kernelspec:
   display_name: Python 3 (ipykernel)
   language: python
@@ -53,7 +53,7 @@ alternative **hypotheses**, key ideas in this lecture
 
 - Type I and type II statistical errors
     - a type I error occurs when you reject a null hypothesis that is true
-    - a type II error occures when you accept a null hypothesis that is false
+    - a type II error occurs when you accept a null hypothesis that is false
 - The **power** of a frequentist statistical test
 - The **size** of a frequentist statistical test
 - The **critical region** of a statistical test
@@ -109,9 +109,9 @@ Let's listen to Milton Friedman tell us what happened
 > can be so regarded.
 
 > When Allen Wallis was discussing such a problem with (Navy) Captain
-> Garret L. Schyler, the captain objected that such a test, to quote from
+> Garret L. Schuyler, the captain objected that such a test, to quote from
 > Allen's account, may prove wasteful. If a wise and seasoned ordnance
-> officer like Schyler were on the premises, he would see after the first
+> officer like Schuyler were on the premises, he would see after the first
 > few thousand or even few hundred [rounds] that the experiment need not
 > be completed either because the new method is obviously inferior or
 > because it is obviously superior beyond what was hoped for
@@ -128,7 +128,7 @@ That set Wald on a path that led him to create *Sequential Analysis* {cite}`W
 It is useful to begin by describing the theory underlying the test
 that the U.S. Navy told Captain G. S. Schuyler to use.
 
-Captain Schulyer's doubts motivated him to tell Milton Friedman and Allan Wallis his conjecture
+Captain Schuyler's doubts motivated him to tell Milton Friedman and Allen Wallis his conjecture
 that superior practical procedures existed.
 
 Evidently, the Navy had told Captain Schuyler to use what was then a state-of-the-art
@@ -256,13 +256,13 @@ Wald summarizes Neyman and Pearson's setup as follows:
 > will have the required size $\alpha$.
 
 Wald goes on to discuss Neyman and Pearson's concept of *uniformly most
-powerful* test.
+powerful* test.
 
 Here is how Wald introduces the notion of a sequential test
 
 > A rule is given for making one of the following three decisions at any stage of
-> the experiment (at the m th trial for each integral value of m ): (1) to
-> accept the hypothesis H , (2) to reject the hypothesis H , (3) to
+> the experiment (at the $m$ th trial for each integral value of $m$): (1) to
+> accept the hypothesis $H$, (2) to reject the hypothesis $H$, (3) to
 > continue the experiment by making an additional observation. Thus, such
 > a test procedure is carried out sequentially. On the basis of the first
 > observation, one of the aforementioned decision is made. If the first or
@@ -271,8 +271,8 @@ Here is how Wald introduces the notion of a sequential test
 > the first two observations, one of the three decision is made. If the
 > third decision is made, a third trial is performed, and so on. The
 > process is continued until either the first or the second decisions is
-> made. The number n of observations required by such a test procedure is
-> a random variable, since the value of n depends on the outcome of the
+> made. The number $n$ of observations required by such a test procedure is
+> a random variable, since the value of $n$ depends on the outcome of the
 > observations.
 
 ## Wald's Sequential Formulation
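
The three-decision rule Wald describes in the passage above is easy to state in code. Here is a minimal editorial sketch, not code from the lectures: `sequential_test`, the beta densities, and the ±log 19 thresholds are all illustrative assumptions.

```python
import numpy as np
from scipy.stats import beta

# Illustrative densities on (0, 1); the lectures use beta distributions,
# but these exact parameters are assumptions of this sketch
f0 = beta(1, 1).pdf
f1 = beta(3, 1.2).pdf

def sequential_test(draws, logA, logB):
    """Apply Wald's three-decision rule at each stage m."""
    log_L = 0.0
    for m, z in enumerate(draws, start=1):
        log_L += np.log(f1(z) / f0(z))  # update the log likelihood ratio
        if log_L >= logA:
            return m, "reject H0"       # decision (2)
        if log_L <= logB:
            return m, "accept H0"       # decision (1)
        # otherwise decision (3): continue and draw again
    return len(draws), "undecided"

draws = beta(3, 1.2).rvs(1000, random_state=np.random.default_rng(0))
n, decision = sequential_test(draws, logA=np.log(19), logB=-np.log(19))
print(n, decision)  # n is a random variable, exactly as Wald notes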
@@ -334,7 +334,7 @@ random variables is also independently and identically distributed (IID).
 
 But the observer does not know which of the two distributions generated the sequence.
 
-For reasons explained in [Exchangeability and Bayesian Updating](https://python.quantecon.org/exchangeable.html), this means that the observer thinks that sequence is not IID.
+For reasons explained in [Exchangeability and Bayesian Updating](https://python.quantecon.org/exchangeable.html), this means that the observer thinks that the sequence is not IID.
 
 Consequently, the observer has something to learn, namely, whether the observations are drawn from $f_0$ or from $f_1$.

@@ -414,7 +414,7 @@ B \approx b(\alpha,\beta) & \equiv \frac{\beta}{1-\alpha}
 \end{aligned}
 $$ (eq:Waldrule)
 
-For small values of $\alpha $ and $\beta$, Wald shows that approximation {eq}`eq:Waldrule` provides a good way to set $A$ and $B$.
+For small values of $\alpha$ and $\beta$, Wald shows that approximation {eq}`eq:Waldrule` provides a good way to set $A$ and $B$.
 
 In particular, Wald constructs a mathematical argument that leads him to conclude that the use of approximation
 {eq}`eq:Waldrule` rather than the true functions $A (\alpha, \beta), B(\alpha,\beta)$ for setting $A$ and $B$
@@ -515,12 +515,12 @@ def sprt_single_run(a0, b0, a1, b1, logA, logB, true_f0, seed):
         return n, True # Accept H0
 
 @njit(parallel=True)
-def run_sprt_simulation(a0, b0, a1, b1, alpha, βs, N, seed):
+def run_sprt_simulation(a0, b0, a1, b1, α, β, N, seed):
     """SPRT simulation described by the algorithm."""
 
     # Calculate thresholds
-    A = (1 - βs) / alpha
-    B = βs / (1 - alpha)
+    A = (1 - β) / α
+    B = β / (1 - α)
     logA = np.log(A)
     logB = np.log(B)
 
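For a concrete feel for the thresholds this function computes, here is a quick check; the values α = β = 0.05 are illustrative, not the lecture's parameter settings. With these values Wald's rule gives A = 19 and B = 1/19, so the log thresholds are symmetric.

```python
import numpy as np

α, β = 0.05, 0.05            # illustrative error targets
A = (1 - β) / α              # 19.0
B = β / (1 - α)              # 1/19 ≈ 0.0526 when α = β
print(np.log(A), np.log(B))  # ≈ 2.944, -2.944: symmetric stopping bounds
```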
@@ -690,9 +690,7 @@ results_3 = run_sprt(params_3)
 ```
 
 ```{code-cell} ipython3
----
-tags: [hide-input]
----
+:tags: [hide-input]
 
 def plot_sprt_results(results, params, title=""):
     """Plot SPRT simulation results."""
@@ -781,7 +779,7 @@ When two distributions are "close", it should takes longer to decide which one
 
 It is tempting to link this pattern to our discussion of [Kullback–Leibler divergence](rel_entropy) in {doc}`likelihood_ratio_process`.
 
-While, KL divergence is larger when two distribution differ more, KL divergence is not symmetric, meaning that the KL divergence of distribution $f$ from distribution $g$ is not necessarily equal to the KL
+While, KL divergence is larger when two distributions differ more, KL divergence is not symmetric, meaning that the KL divergence of distribution $f$ from distribution $g$ is not necessarily equal to the KL
 divergence of $g$ from $f$.
 
 If we want a symmetric measure of divergence that actually a metric, we can instead use [Jensen-Shannon distance](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.jensenshannon.html).
@@ -793,7 +791,7 @@ We shall compute Jensen-Shannon distance and plot it against the average stoppi
 
 ```{code-cell} ipython3
 def kl_div(h, f):
     """KL divergence"""
-    integrand = lambda w: f(w) * np.log(f(w) / h(w))
+    integrand = lambda w: h(w) * np.log(h(w) / f(w))
     val, _ = quad(integrand, 0, 1)
     return val
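The corrected integrand computes $KL(h\|f) = \int h \log(h/f)$, and the argument order matters because KL divergence is asymmetric. A quick editorial check with illustrative beta densities; the parameters, the integration bounds, and the discretization grid are assumptions of this sketch.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import beta
from scipy.spatial.distance import jensenshannon

h = beta(3, 1.2).pdf   # illustrative densities on (0, 1)
f = beta(1, 1).pdf

def kl_div(h, f):
    """KL divergence KL(h || f), matching the corrected integrand."""
    integrand = lambda w: h(w) * np.log(h(w) / f(w))
    # integrate slightly inside (0, 1) to avoid endpoint trouble
    val, _ = quad(integrand, 1e-6, 1 - 1e-6)
    return val

print(kl_div(h, f), kl_div(f, h))  # the two orderings generally differ

# Jensen-Shannon distance is symmetric; scipy's version takes discrete
# probability vectors, so evaluate the densities on a grid and normalize
grid = np.linspace(1e-6, 1 - 1e-6, 1_000)
print(jensenshannon(h(grid) / h(grid).sum(), f(grid) / f(grid).sum()))
```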
@@ -896,7 +894,7 @@ plt.tight_layout()
 plt.show()
 ```
 
-Again, we find that the stopping time is shorter when the distributions are more separated
+Again, we find that the stopping time is shorter when the distributions are more separated, as
 measured by Jensen-Shannon distance.
 
 Let's visualize individual likelihood ratio processes to see how they evolve toward the decision boundaries.
@@ -981,12 +979,12 @@ In the code below, we adjust Wald's rule by adjusting the thresholds $A$ and $B
 
 ```{code-cell} ipython3
 @njit(parallel=True)
-def run_adjusted_thresholds(a0, b0, a1, b1, alpha, βs, N, seed, A_f, B_f):
+def run_adjusted_thresholds(a0, b0, a1, b1, α, β, N, seed, A_f, B_f):
     """SPRT simulation with adjusted thresholds."""
 
     # Calculate original thresholds
-    A_original = (1 - βs) / alpha
-    B_original = βs / (1 - alpha)
+    A_original = (1 - β) / α
+    B_original = β / (1 - α)
 
     # Apply adjustment factors
     A_adj = A_original * A_f

lectures/wald_friedman_2.md

Lines changed: 10 additions & 12 deletions
@@ -4,7 +4,7 @@ jupytext:
     extension: .md
     format_name: myst
    format_version: 0.13
-    jupytext_version: 1.17.2
+    jupytext_version: 1.17.1
 kernelspec:
   display_name: Python 3 (ipykernel)
   language: python
@@ -52,16 +52,16 @@ A frequentist statistician studies the distribution of that statistic under that
 * when the distribution is a member of a set of parameterized probability distributions, his hypothesis takes the form of a particular parameter vector.
   * this is what we mean when we say that the frequentist statistician 'conditions on the parameters'
   * he regards the parameters as fixed numbers that are known to nature, but not to him.
-  * the statistician copes with his ignorance of thoe parameters by constructing type I and type II errors associated with frequentist hypothesis testing.
+  * the statistician copes with his ignorance of those parameters by constructing type I and type II errors associated with frequentist hypothesis testing.
 
 In this lecture, we reformulate Friedman and Wald's problem by transforming our point of view from the 'objective' frequentist perspective of {doc}`the lecture on Wald's sequential analysis<wald_friedman>` to an explicitly 'subjective' perspective taken by a Bayesian decision maker who regards parameters not as fixed numbers but as (hidden) random variables that are jointly distributed with the random variables that he can observe by sampling from that joint distribution.
 
 To form that joint distribution, the Bayesian statistician supplements the conditional distributions used by the frequentist statistician with
-a prior probability distribution over the parameters that representive his personal, subjective opinion about those them.
+a prior probability distribution over the parameters that represents his personal, subjective opinion about them.
 
 That lets the Bayesian statistician calculate the joint distribution that he requires to calculate the conditional distributions that he wants.
 
-To proceed in the way, we endow our decision maker with
+To proceed in this way, we endow our decision maker with
 
 - an initial prior subjective probability $\pi_{-1} \in (0,1)$ that nature uses to generate $\{z_k\}$ as a sequence of i.i.d. draws from $f_1$ rather than $f_0$.
 - faith in Bayes' law as a way to revise his subjective beliefs as observations on $\{z_k\}$ sequence arrive.
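
In the two-distribution setting the last bullet describes, Bayes' law gives the posterior update $\pi_{k+1} = \pi_k f_1(z_{k+1}) / \left(\pi_k f_1(z_{k+1}) + (1 - \pi_k) f_0(z_{k+1})\right)$. A minimal sketch; `bayes_update`, the densities, and the observations are illustrative assumptions, not code from the lecture.

```python
from scipy.stats import beta

def bayes_update(π, z, f0, f1):
    """Revise the belief π that draws come from f1 after observing z."""
    num = π * f1(z)
    return num / (num + (1 - π) * f0(z))

f0, f1 = beta(1, 1).pdf, beta(3, 1.2).pdf  # illustrative densities
π = 0.5                                    # initial prior π_{-1}
for z in (0.8, 0.9, 0.7):                  # hypothetical observations
    π = bayes_update(π, z, f0, f1)
print(π)  # draws near 1 favor f1, so the belief in f1 has risen
```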
@@ -97,12 +97,10 @@ from numba.experimental import jitclass
 from math import gamma
 ```
 
-
-
 ## A Dynamic Programming Approach
 
 The following presentation of the problem closely follows Dmitri
-Berskekas's treatment in **Dynamic Programming and Stochastic Control** {cite}`Bertekas75`.
+Bertsekas's treatment in **Dynamic Programming and Stochastic Control** {cite}`Bertsekas75`.
 
 A decision-maker can observe a sequence of draws of a random variable $z$.
 
@@ -134,7 +132,7 @@ $$
 $$
 
 ```{note}
-In {cite:t}`Bertekas75`, the belief is associated with the distribution $f_0$, but here
+In {cite:t}`Bertsekas75`, the belief is associated with the distribution $f_0$, but here
 we associate the belief with the distribution $f_1$ to match the discussions in {doc}`the lecture on Wald's sequential analysis<wald_friedman>`.
 ```

@@ -194,7 +192,7 @@ axes[0].plot(grid, f1(grid), lw=2, label="$f_1$")
 axes[1].set_title("Mixtures")
 for π in 0.25, 0.5, 0.75:
     y = (1 - π) * f0(grid) + π * f1(grid)
-    axes[1].plot(y, lw=2, label=fr"$\pi_k$ = {π}")
+    axes[1].plot(grid, y, lw=2, label=fr"$\pi_k$ = {π}")
 
 for ax in axes:
     ax.legend()
@@ -756,11 +754,11 @@ and investigate
 as we increase the number of grid points in the piecewise linear approximation.
 * effects of different settings for the cost parameters $L_0, L_1, c$, the
   parameters of two beta distributions $f_0$ and $f_1$, and the number
-  of points and linear functions $m$ to use in the piece-wise continuous approximation to the value function.
+  of points and linear functions $m$ to use in the piecewise continuous approximation to the value function.
 * various simulations from $f_0$ and associated distributions of waiting times to making a decision.
 * associated histograms of correct and incorrect decisions.
 
 
 [^f1]: The decision maker acts as if he believes that the sequence of random variables
-  $[z_{0}, z_{1}, \ldots]$ is *exchangeable*. See [Exchangeability and Bayesian Updating](https://python.quantecon.org/exchangeable.html) and
-  {cite}`Kreps88` chapter 11, for discussions of exchangeability.
+  $[z_{0}, z_{1}, \ldots]$ is *exchangeable*. See [Exchangeability and Bayesian Updating](https://python.quantecon.org/exchangeable.html) and
+  {cite}`Kreps88` chapter 11, for discussions of exchangeability.
