In some sense, the strong law of large numbers is the bridge from probability to logic. It is the primary way to generate certainties from probabilities.

The basic theorem is as follows. We will prove this and some other variants.


Theorem. Let $x_1, x_2, \dots$ be an identically distributed, pairwise-independent sequence of random variables with finite mean $\mathbb E[x_i] = \mu$. Define $s_n := \frac1n \sum_{i=1}^n x_i$ to be the average of the first $n$ samples. Then:

$$P(s_n \to \mu) = 1$$

Which proof is simplest depends on what extra assumptions we make. We will prove the theorem in three different settings:

  1. The case where the fourth moment is controlled (standard, this gives a very easy proof, and yields a law for triangular arrays).
  2. The case where the second moment is controlled (this is a known result, which also allows weakening independence to orthogonality; my proof is nonstandard).
  3. The general case (standard, basically following Etemadi’s proof).

For the moment-controlled proofs we will assume without loss of generality that $\mu = 0$.
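Before the proofs, here is a minimal numerical sanity check (an illustration only, not part of any argument), assuming numpy is available; the distribution, sample size, and seed are arbitrary choices. The running average of iid Exponential(1) samples settles near the true mean $\mu = 1$.

```python
import numpy as np

# Minimal sanity check (illustration only): running averages s_n of iid
# Exponential(1) samples drift toward the true mean mu = 1.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)
s = np.cumsum(x) / np.arange(1, x.size + 1)   # s[n-1] = s_n
for n in (10, 1_000, 100_000, 1_000_000):
    print(f"s_{n} = {s[n - 1]:.4f}")
```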


General tool

The main tool is a series criterion: if, for every tolerance $\varepsilon > 0$, the tail probabilities $P(|z_n| \geq \varepsilon)$ are summable, then the sequence converges to zero almost surely.

Theorem. Let $z_n$ be any sequence of random variables. Then

$$(\forall \varepsilon > 0) \left(\sum_{n=1}^\infty P(|z_n| \geq \varepsilon) < \infty\right) \implies (P(z_n \to 0) = 1)$$

Proof.

\begin{align*} P(z_n \not\to 0) &= P((\exists \varepsilon > 0)(\forall N \geq 1)(\exists n \geq N): |z_n| \geq \varepsilon) \\ &= P((\exists k \geq 1)(\forall N \geq 1)(\exists n \geq N): |z_n| \geq 1/k) \\ &= P\left(\bigcup_{k \geq 1} \left\{\sum_{n=1}^\infty 1\{|z_n| \geq 1/k\} = \infty\right\}\right) \\ &= \lim_{k \to \infty} P\left(\sum_{n=1}^\infty 1\{|z_n| \geq 1/k\} = \infty\right) \\ &= \lim_{\varepsilon \to 0} P\left(\sum_{n=1}^\infty 1\{|z_n| \geq \varepsilon\} = \infty\right) \\ \end{align*}

where the limit follows from the continuity of the probability measure with respect to increasing countable unions.

Intuition: if the overall probability of nonconvergence is nonzero, then there must exist some specific fixed $\varepsilon > 0$ such that, with nonzero probability, $|z_n| \geq \varepsilon$ infinitely often. The problem thus reduces to showing that this event has probability zero for every $\varepsilon > 0$.

The following lemma relates this event to the series of probabilities in the theorem statement.


Lemma (the first Borel-Cantelli lemma). Let $E_1, E_2, \dots$ be any sequence of events. Then

$$\sum_{n=1}^\infty P(E_n) < \infty \implies P\left(\sum_{n=1}^\infty 1_{E_n} = \infty\right) = 0.$$

Proof.

\begin{align*} P\left(\sum_{n=1}^\infty 1_{E_n} = \infty\right) &= P\left((\forall N \geq 1)(\exists n \geq N) : 1_{E_n} = 1\right) \\ &= P\left(\bigcap_{N \geq 1} \bigcup_{n \geq N} E_n\right) \\ &= \lim_{N \to \infty}P\left(\bigcup_{n \geq N} E_n\right) \\ &\leq \lim_{N \to \infty}\sum_{n \geq N} P(E_n) \\ &= \lim_{N \to \infty}\left(\left(\sum_{n=1}^\infty P(E_n)\right) - \left(\sum_{n=1}^{N-1} P(E_n)\right)\right) \\ &= \left(\sum_{n=1}^\infty P(E_n)\right) - \lim_{N \to \infty}\left(\sum_{n=1}^{N-1} P(E_n)\right) \\ &= \left(\sum_{n=1}^\infty P(E_n)\right) - \left(\sum_{n=1}^\infty P(E_n)\right) \\ &= 0. \end{align*}

QED.


Combining this lemma with the earlier result yields the theorem.

QED.
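As a quick illustration of the criterion (assuming numpy; the events, probabilities, and seed are arbitrary choices for the sketch), events with summable probabilities occur only finitely often in a realization:

```python
import numpy as np

# Illustration of the lemma: independent events E_n with P(E_n) = 1/n^2 have a
# summable probability series, so in any realization only finitely many occur.
rng = np.random.default_rng(1)
n = np.arange(1, 1_000_001)
occurred = rng.random(n.size) < 1.0 / n.astype(float) ** 2
print("sum of P(E_n):", float(np.sum(1.0 / n.astype(float) ** 2)))  # ~ pi^2 / 6
print("number of E_n that occurred:", int(occurred.sum()))
print("largest index that occurred:", int(n[occurred].max()))
```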

Remark: This condition is sufficient but not necessary. For example, if we have $z_0 \sim \mathrm{Exponential}(1)$ (so that $P(z_0 \geq C) = e^{-C}$), and $z_n = z_0/\log n$, then $P(z_n \to 0) = 1$, despite the fact that $P(|z_n| \geq \varepsilon) = e^{-\varepsilon\log n} = n^{-\varepsilon}$ does not yield a convergent series for $\varepsilon \leq 1$.
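A small numerical illustration of this remark, again assuming numpy (the seed is arbitrary): $z_n$ visibly tends to zero for the sampled $z_0$, while the partial sums of $n^{-\varepsilon}$ keep growing.

```python
import numpy as np

# The condition is not necessary: z_n = z_0 / log(n) -> 0 for every draw of z_0,
# yet sum_n P(|z_n| >= eps) = sum_n n^(-eps) diverges for eps <= 1.
rng = np.random.default_rng(2)
z0 = rng.exponential(1.0)
for n in (10, 10**3, 10**6, 10**9):
    print(f"z_{n} = {z0 / np.log(n):.5f}")      # tends to 0

eps = 0.5
ns = np.arange(2, 10**6).astype(float)
print("partial sum of n^(-eps) up to 10^6:", float(np.sum(ns ** -eps)))  # large, still growing
```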


Strong law for bounded fourth moment

The following theorem assumes more than the full SLLN (in particular a finite fourth moment), but has a very easy proof.

Theorem. Assume that $(x_i)$ is an identically distributed sequence with $\mathbb E[x_i] = 0$, $\mathbb E[x_i^4] < \infty$, and $\mathbb E[x_ix_jx_kx_l] = 0$ whenever $i \not\in \\{j, k, l\\}$. Then:

$$P(s_n \to 0) = 1$$

Proof.

We have:

\begin{align*} \mathbb E[s_n^4] &= \frac{1}{n^4}\sum_{i=1}^n \sum_{j=1}^n \sum_{k=1}^n \sum_{l=1}^n \mathbb E[x_ix_jx_kx_l] \\ &= \frac{1}{n^4} \sum_{i=1}^n \mathbb E[x_i^4] + \frac{1}{n^4} \sum_{1 \leq i < j \leq n}\binom{4}{2}\mathbb E[x_i^2x_j^2] \\ &\leq \frac{1}{n^4} \sum_{i=1}^n \mathbb E[x_i^4] + \frac{1}{n^4} \sum_{1 \leq i < j \leq n}\binom{4}{2}\mathbb E[x_i^4]^{1/2}\mathbb E[x_j^4]^{1/2} \\ &= \frac{1}{n^4}n \mathbb E[x_1^4] + \frac{1}{n^4} \binom{n}{2}\binom{4}{2}\mathbb E[x_1^4] \\ &= (1/n^3)\mathbb E[x_1^4](1 + 3(n-1)) \\ &< (3/n^2)\mathbb E[x_1^4] \end{align*}

where the inequality in the third line is Cauchy-Schwarz. Therefore:

\begin{align*} \sum_{n=1}^\infty P(|s_n| \geq \varepsilon) &= \sum_{n=1}^\infty \mathbb E[1\{|s_n| \geq \varepsilon\}] \\ &= \sum_{n=1}^\infty (1/\varepsilon^4)\mathbb E[\varepsilon^4 1\{s_n^4 \geq \varepsilon^4\}] \\ &\leq \frac{1}{\varepsilon^4}\sum_{n=1}^\infty \mathbb E[s_n^4] \\ &< \frac{1}{\varepsilon^4}\sum_{n=1}^\infty (3/n^2) \mathbb E[x_1^4] \\ &= \frac{\pi^2}{2\varepsilon^4}\mathbb E[x_1^4] \\ &< \infty. \end{align*}

This is finite, so the tail sums, which bound the probability of $\\{(\exists n \geq N): |s_n| \geq \varepsilon\\}$, converge to zero, and the criterion from the previous section gives $P(s_n \to 0) = 1$. QED.

(Note: we will henceforth omit the intermediate steps that allowed us to replace $P(|s_n| \geq \varepsilon)$ by $\frac{1}{\varepsilon^k} \mathbb E[|s_n|^k]$; this is just Markov's inequality applied to $|s_n|^k$.)
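Here is a Monte Carlo spot check of the bound $\mathbb E[s_n^4] < (3/n^2)\mathbb E[x_1^4]$, assuming numpy. Rademacher ($\pm 1$) variables are an arbitrary illustrative choice satisfying the hypotheses, with $\mathbb E[x_1^4] = 1$; the trial count and seed are likewise arbitrary.

```python
import numpy as np

# Spot check of E[s_n^4] < 3 E[x_1^4] / n^2 for iid Rademacher variables,
# which are mean zero with E[x_1^4] = 1 and satisfy the decorrelation condition.
rng = np.random.default_rng(3)
trials = 10_000
for n in (10, 100, 1000):
    x = rng.choice([-1.0, 1.0], size=(trials, n))
    s = x.mean(axis=1)
    print(f"n={n}: estimated E[s_n^4] = {np.mean(s**4):.2e}, bound = {3 / n**2:.2e}")
```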


Strong law for triangular arrays with bounded fourth moment

Theorem. Let $(x_{k,i})_{k \geq 1, 1 \leq i \leq k}$ be a triangular array of identically distributed random variables, such that each row $(x_{k,1}, \dots, x_{k,k})$ meets the criterion of the previous theorem. Define $s_n := \frac1n \sum_{i=1}^n x_{n,i}$. Then:

$$P(s_n \to 0) = 1$$

Proof.

In the proof of the previous theorem we showed that $\mathbb E[s_n^4] < (3/n^2)\mathbb E[x_{n,1}^4]$; that computation only used the assumptions within a single row, so it applies verbatim here. Therefore $\sum_{n=1}^\infty P(|s_n| \geq \varepsilon) < \infty$, and the series criterion again gives $P(s_n \to 0) = 1$. QED.

Remark: a slightly weaker version of this (assuming full independence rather than the decorrelation criterion in the previous theorem) can be found in Terry Tao’s notes. The proof is the same.
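A sketch of what the triangular-array statement allows, assuming numpy (the distribution and seed are arbitrary): each row is a completely fresh batch of $n$ independent centered samples, not nested in the previous rows, yet the row averages still shrink toward zero in this realization.

```python
import numpy as np

# Each row n is a fresh, independent batch of n centered Uniform(-1, 1) samples;
# the rows need not be nested, and the row averages s_n still head to 0.
rng = np.random.default_rng(4)
for n in (10, 1_000, 100_000, 10_000_000):
    row = rng.uniform(-1.0, 1.0, size=n)
    print(f"s_{n} = {row.mean():+.6f}")
```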


Strong law for orthogonal identical variables with finite variance

The following theorem requires the additional assumption of finite variance, but allows us to drop the independence assumption, so it is neither strictly weaker nor strictly stronger than the full SLLN.

Theorem. Assume that $(x_i)$ is an identically distributed sequence of random variables which are pairwise uncorrelated and have finite variance $\sigma^2$. Then:

$$P(s_n \to \mu) = 1$$

Proof.

Intuition: the Chebyshev bound for the full sequence decays too slowly (it would yield a harmonic series). But we can control the quadratic subsequence $s_{m^2}$, and then prove that a slight rescaling of the main sequence stays close to it.

Without loss of generality assume that $\mathbb E[x_i] = 0$. The quadratic subsequence converges to zero almost surely because:

\begin{align*} \sum_{m=1}^\infty P(|s_{m^2}| \geq \varepsilon) &\leq \frac{1}{\varepsilon^2}\sum_{m=1}^\infty \mathbb E[s_{m^2}^2] \\ &= \frac{1}{\varepsilon^2}\sum_{m=1}^\infty \frac{1}{m^4}\sum_{i=1}^{m^2} \sum_{j=1}^{m^2} \mathbb E[x_ix_j] \\ &= \frac{1}{\varepsilon^2}\sum_{m=1}^\infty \frac{1}{m^4}\sum_{i=1}^{m^2} \mathbb E[x_i^2] \\ &= \frac{\pi^2\sigma^2}{6\varepsilon^2} \\ &< \infty \end{align*}

After a slight rescaling, the main sequence stays close to its quadratic subsequence.

\begin{align*} \sum_{n=1}^\infty P\left(\left|\tfrac{n}{\lfloor \sqrt{n}\rfloor^2}s_n - s_{\lfloor \sqrt{n}\rfloor^2}\right| \geq \varepsilon\right) &= \sum_{m=1}^\infty \sum_{n=m^2}^{(m+1)^2-1} P\left(|\tfrac{n}{m^2}s_n - s_{m^2}| \geq \varepsilon\right) \\ &= \sum_{m=1}^\infty \sum_{n=m^2}^{(m+1)^2-1} P\left(\left|\frac{1}{m^2}\sum_{k=m^2+1}^n x_k\right| \geq \varepsilon\right) \\ &\leq \sum_{m=1}^\infty \sum_{n=m^2}^{(m+1)^2-1} \frac{1}{\varepsilon^2}\mathbb E\left[\left(\frac{1}{m^2}\sum_{k=m^2+1}^n x_k\right)^2\right] \\ &= \sum_{m=1}^\infty \sum_{n=m^2}^{(m+1)^2-1} \frac{\sigma^2}{\varepsilon^2}\frac{n-m^2}{m^4} \\ &< \frac{\sigma^2}{\varepsilon^2}\sum_{m=1}^\infty \sum_{n=m^2}^{(m+1)^2-1} \frac{(m+1)^2-m^2}{m^4} \\ &= \frac{\sigma^2}{\varepsilon^2}\sum_{m=1}^\infty \frac{((m+1)^2 - m^2)^2}{m^4} \\ &= \frac{\sigma^2}{\varepsilon^2}\sum_{m=1}^\infty \frac{(2m+1)^2}{m^4} \\ &< \frac{\sigma^2}{\varepsilon^2}\sum_{m=1}^\infty \frac{(3m)^2}{m^4} \\ &= \frac{9\sigma^2}{\varepsilon^2}\sum_{m=1}^\infty \frac{1}{m^2} \\ &= \frac{3\pi^2\sigma^2}{2\varepsilon^2} \\ &< \infty. \end{align*}

Therefore, almost surely $\tfrac{n}{\lfloor \sqrt{n}\rfloor^2}s_n - s_{\lfloor \sqrt{n}\rfloor^2} \to 0$; combined with the previous display, this gives $\tfrac{n}{\lfloor \sqrt{n}\rfloor^2}s_n \to 0$ almost surely. Finally, on this almost-sure event the main sequence stays near $\tfrac{n}{\lfloor \sqrt{n}\rfloor^2}s_n$ because:

\begin{align*} \lim \left|\tfrac{n}{\lfloor \sqrt{n}\rfloor^2}s_n - s_n\right| &= \lim\left|\tfrac{n}{\lfloor \sqrt{n}\rfloor^2}s_n\right|\left|1 - \tfrac{\lfloor \sqrt{n}\rfloor^2}{n}\right| \\ &= \lim\left|\tfrac{n}{\lfloor \sqrt{n}\rfloor^2}s_n\right|\lim\left|1 - \tfrac{\lfloor \sqrt{n}\rfloor^2}{n}\right| \\ &= 0 \cdot 0 \\ &= 0. \end{align*}

Therefore, almost surely, $s_n$ stays close to $\tfrac{n}{\lfloor \sqrt{n}\rfloor^2}s_n$, which stays close to $s_{\lfloor \sqrt{n}\rfloor^2}$, which converges to the mean. QED.
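A numerical sketch of the two-step argument, assuming numpy (the distribution, horizon, and seed are arbitrary): the quadratic subsequence $s_{m^2}$ settles down, and within each block $m^2 \leq n < (m+1)^2$ the rescaled average $\tfrac{n}{m^2}s_n$ stays close to $s_{m^2}$.

```python
import numpy as np

# The quadratic subsequence s_{m^2} converges, and within each block
# m^2 <= n < (m+1)^2 the rescaled average (n/m^2) s_n stays close to s_{m^2}.
# Here the x_i are iid centered uniforms (in particular pairwise uncorrelated).
rng = np.random.default_rng(5)
N = 1_000_000
x = rng.uniform(-1.0, 1.0, size=N)
s = np.cumsum(x) / np.arange(1, N + 1)                    # s[n-1] = s_n
for m in (10, 100, 500, 999):
    block = np.arange(m * m, min((m + 1) ** 2, N + 1))    # n ranges over the block
    dev = np.max(np.abs(block / m**2 * s[block - 1] - s[m * m - 1]))
    print(f"m={m}: s_(m^2) = {s[m * m - 1]:+.5f}, max block deviation = {dev:.5f}")
```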


Does there exist a strong law for triangular arrays of identical orthogonal variables?

Nope: a counterexample can be constructed using the example provided in https://terrytao.wordpress.com/2010/01/03/254a-notes-1-concentration-of-measure/.


Strong law for pairwise-independent identically distributed variables (Etemadi 1981)

Theorem. Assume $(x_i)$ is identically distributed and pairwise independent, with finite mean $\mu$. Then:

$$P(s_n \to \mu) = 1$$

Proof.


Outline:

  1. Reduce to the nonnegative case.
  2. Reduce to the linear truncation thereof.
  3. Reduce to an exponential subsequence thereof.
  4. Prove the SLLN for all exponential subsequences of the linear truncation of a positive sequence.

Since $\mathbb E[x_i]$ is assumed finite, $\mathbb E[|x_i|] < \infty$, so the positive and negative parts $1\\{x_i \geq 0\\}|x_i|$ and $1\\{x_i \leq 0\\}|x_i|$ each have finite mean (and each forms an identically distributed, pairwise-independent, nonnegative sequence). Therefore if we can prove the theorem for the nonnegative case, then the general case follows, almost surely, via

\begin{align*} \lim_{n \to \infty} \frac1n \sum_{i=1}^n x_i &= \lim_{n \to \infty} \frac1n \sum_{i=1}^n \left((1\{x_i \geq 0\}|x_i|) - (1\{x_i \leq 0\}|x_i|)\right) \\ &= \lim_{n \to \infty} \frac1n \sum_{i=1}^n (1\{x_i \geq 0\}|x_i|) - \lim_{n \to \infty} \frac1n \sum_{i=1}^n (1\{x_i \leq 0\}|x_i|) \\ &= \mathbb E[1\{x_1 \geq 0\}|x_1|] - \mathbb E[1\{x_1 \leq 0\}|x_1|] \\ &= \mathbb E[1\{x_1 \geq 0\}|x_1| - 1\{x_1 \leq 0\}|x_1|] \\ &= \mathbb E[x_1] = \mu. \end{align*}

So without loss of generality, we reduce to the assumption that $x_i \geq 0$.


Now define $y_i := x_i1\\{x_i \leq i\\}$, and $t_n := \frac1n \sum_{i=1}^n y_i$. Our first goal is to prove that, almost surely, $\limsup|s_n - t_n| = 0$.

Using an earlier lemma (Borel-Cantelli), the probability that $x_i \neq y_i$ infinitely often is zero, since:

\begin{align*} \sum_{i=1}^\infty P(x_i \neq y_i) &= \sum_{i=1}^\infty P(x_i > i) \\ &= \sum_{i=1}^\infty P(x_1 > i) \\ &\leq \int_0^\infty P(x_1 > c)\,dc \\ &= \mathbb E[x_1] \\ &< \infty, \end{align*}

where the inequality holds because $P(x_1 > i) \leq P(x_1 > c)$ whenever $c \leq i$, and the final equality is the tail formula for the mean of a nonnegative random variable.
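As a concrete numeric instance of this bound (an illustration only, assuming numpy): for Exponential(1) variables, $\sum_i P(x_1 > i) = \sum_i e^{-i} = 1/(e-1) \approx 0.58$, comfortably below $\mathbb E[x_1] = 1$.

```python
import numpy as np

# For x_1 ~ Exponential(1): sum_i P(x_1 > i) = sum_i e^{-i} = 1/(e - 1) ~ 0.58,
# which is at most E[x_1] = 1, so x_i != y_i happens only finitely often a.s.
i = np.arange(1, 200)
print("sum_i P(x_1 > i) =", float(np.sum(np.exp(-i))), "<= E[x_1] = 1")
```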

Therefore, by the following lemma we have $P(\limsup |s_n - t_n| = 0) = 1$.


Lemma. $\sum_{i=1}^\infty 1\\{x_i \neq y_i\\} < \infty$ implies that $\limsup |s_n - t_n| = 0$.

Proof. Let $n_0$ be such that $\sum_{i=n_0}^\infty 1\\{x_i \neq y_i\\} = 0$. This exists because:

\begin{align*} \sum_{i=1}^\infty 1\{x_i \neq y_i\} < \infty &\implies \lim_{n \to \infty} \sum_{i=n}^\infty 1\{x_i \neq y_i\} = 0 \\ &\implies (\exists N \geq 1)(\forall n \geq N) \sum_{i=n}^\infty 1\{x_i \neq y_i\} < 1 \\ &\implies (\exists n_0 \geq 1) \sum_{i=n_0}^\infty 1\{x_i \neq y_i\} = 0 \\ \end{align*}

Therefore we have:

\begin{align*} \limsup |s_n - t_n| &= \limsup \left|\frac{n_0s_{n_0} + \sum_{i=n_0+1}^n x_i}{n} - \frac{n_0t_{n_0} + \sum_{i=n_0+1}^n y_i}{n}\right| \\ &= \limsup_{n \to \infty} (n_0/n)|s_{n_0} - t_{n_0}| = 0. \end{align*}

QED.

Remark: This proof works for arbitrary sequences $x_i$ and $y_i$.


Our goal therefore reduces to showing that, almost surely, $\lim_{n \to \infty} t_n = \mu$. The nonnegativity of the $y_i$ means we can control the whole sequence by its subsequences. We first need the following lemma.


Lemma. Let $y_i \geq 0$ be any nonnegative sequence, and $t_n := \frac1n \sum_{i=1}^n y_i$. Then for any strictly increasing sequence of indices $k(n)$:

$$\liminf_{n \to \infty} \frac{k(n)}{k(n+1)} t_{k(n)} \leq \liminf_{m \to \infty} t_m \leq \limsup_{m \to \infty} t_m \leq \limsup_{n \to \infty} \frac{k(n+1)}{k(n)} t_{k(n+1)}$$

Proof.

Let $m \in \mathbb N$ be any index with $k(n) \leq m < k(n+1)$. Then:

$$t_m = \frac1m \sum_{i=1}^m y_i \geq \frac1m \sum_{i=1}^{k(n)} y_i = \frac{k(n)}{m} t_{k(n)} \geq \frac{k(n)}{k(n+1)} t_{k(n)}$$
$$t_m = \frac1m \sum_{i=1}^m y_i \leq \frac1m \sum_{i=1}^{k(n+1)} y_i = \frac{k(n+1)}{m} t_{k(n+1)} \leq \frac{k(n+1)}{k(n)} t_{k(n+1)}$$

As $m \to \infty$, the block index $n$ (for which $k(n) \leq m < k(n+1)$) also tends to infinity, so taking $\liminf$ and $\limsup$ over $m$ gives:

$$\liminf_{m \to \infty} t_m \geq \liminf_{n \to \infty} \frac{k(n)}{k(n+1)} t_{k(n)}$$
$$\limsup_{m \to \infty} t_m \leq \limsup_{n \to \infty} \frac{k(n+1)}{k(n)} t_{k(n+1)}$$

QED.


Let $\alpha > 1$. Then for $k(n) = \lfloor \alpha^n \rfloor$ we have $k(n)/k(n+1) \to 1/\alpha$, so the above lemma says that

$$(1/\alpha)\liminf_{n \to \infty} t_{\lfloor \alpha^n \rfloor} \leq \liminf_{m \to \infty} t_m \leq \limsup_{m \to \infty} t_m \leq \alpha\limsup_{n \to \infty} t_{\lfloor \alpha^n \rfloor}$$

Therefore the proof further reduces to showing that $P(\lim_{n \to \infty} t_{\lfloor \alpha^n \rfloor} = \mu) = 1$ for all $\alpha > 1$. We first show that, almost surely, $t_{\lfloor \alpha^n \rfloor} - \mathbb E[t_{\lfloor \alpha^n \rfloor}] \to 0$; we center at $\mathbb E[t_{\lfloor \alpha^n \rfloor}]$ rather than at $\mu$ because the truncated variables $y_i$ do not have mean exactly $\mu$. In complete painful detail:

\begin{align*} \sum_{n=1}^\infty P(|t_{\lfloor \alpha^n \rfloor} - \mathbb E[t_{\lfloor \alpha^n \rfloor}]| \geq \varepsilon) &\leq \sum_{n=1}^\infty \frac{1}{\varepsilon^2}\mathrm{Var}(t_{\lfloor \alpha^n \rfloor}) \\ &= \frac{1}{\varepsilon^2}\sum_{n=1}^\infty \frac{1}{\lfloor \alpha^n \rfloor^2} \sum_{i=1}^{\lfloor \alpha^n \rfloor} \mathrm{Var}(y_i) \\ &= \frac{1}{\varepsilon^2}\sum_{n, i \geq 1} 1\{i \leq \lfloor \alpha^n \rfloor\}\frac{1}{\lfloor \alpha^n \rfloor^2}\mathrm{Var}(y_i) \\ &= \frac{1}{\varepsilon^2}\sum_{i=1}^\infty \mathrm{Var}(y_i) \sum_{n : \lfloor \alpha^n \rfloor \geq i}\frac{1}{\lfloor \alpha^n \rfloor^2} \\ &\leq \frac{1}{\varepsilon^2}\sum_{i=1}^\infty \mathrm{Var}(y_i) \sum_{n : \alpha^n \geq i}\frac{1}{(\alpha^n/2)^2} \\ &= \frac{1}{\varepsilon^2}\sum_{i=1}^\infty \mathrm{Var}(y_i) (\alpha^{\min\{n : \alpha^n \geq i\}}/2)^{-2}\sum_{n \geq 0}(\alpha^{-2})^n \\ &= \frac{1}{\varepsilon^2}\sum_{i=1}^\infty \mathrm{Var}(y_i)\, 4(\alpha^{\min\{n : \alpha^n \geq i\}})^{-2} \frac{1}{1 - \alpha^{-2}} \\ &= \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{i=1}^\infty (\alpha^{\min\{n : \alpha^n \geq i\}})^{-2} \mathrm{Var}(y_i) \\ &\leq \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{i=1}^\infty i^{-2}\mathrm{Var}(y_i) \\ &\leq \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{i=1}^\infty i^{-2}\mathbb E[y_i^2] \\ &= \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{i=1}^\infty i^{-2}\mathbb E\left[\sum_{j=1}^\infty 1\{j-1 < y_i \leq j\}y_i^2\right] \\ &= \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{i=1}^\infty i^{-2}\sum_{j=1}^\infty \mathbb E[1\{j-1 < y_i \leq j\}y_i^2] \\ &= \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{i=1}^\infty i^{-2}\sum_{j=1}^\infty \mathbb E[1\{j-1 < x_1 \leq j\}(1\{j \leq i\}x_1)^2] \\ &= \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{i=1}^\infty i^{-2}\sum_{j=1}^\infty 1\{j \leq i\}\mathbb E[1\{j-1 < x_1 \leq j\}x_1^2] \\ &= \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{i,j \geq 1} 1\{j \leq i\} i^{-2}\mathbb E[1\{j-1 < x_1 \leq j\}x_1^2] \\ &= \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{j=1}^\infty \sum_{i \geq j} i^{-2} \mathbb E[1\{j-1 < x_1 \leq j\}x_1^2] \\ &= \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{j=1}^\infty\mathbb E[1\{j-1 < x_1 \leq j\}x_1^2] \sum_{i \geq j} i^{-2} \\ &= \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{j=1}^\infty\mathbb E[1\{j-1 < x_1 \leq j\}x_1^2] \left(j^{-2} + \sum_{i > j} i^{-2}\right) \\ &< \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{j=1}^\infty\mathbb E[1\{j-1 < x_1 \leq j\}x_1^2] \left(j^{-2} + \int_j^\infty c^{-2}dc\right) \\ &= \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{j=1}^\infty\mathbb E[1\{j-1 < x_1 \leq j\}x_1^2] \left(j^{-2} + j^{-1}\right) \\ &\leq \frac{4}{\varepsilon^2(1 - \alpha^{-2})}\sum_{j=1}^\infty\mathbb E[1\{j-1 < x_1 \leq j\}x_1^2] 2j^{-1} \\ &= \frac{8}{\varepsilon^2(1 - \alpha^{-2})}\sum_{j=1}^\infty\mathbb E[1\{j-1 < x_1 \leq j\}(x_1/j)x_1] \\ &\leq \frac{8}{\varepsilon^2(1 - \alpha^{-2})}\sum_{j=1}^\infty\mathbb E[1\{j-1 < x_1 \leq j\}x_1] \\ &= \frac{8}{\varepsilon^2(1 - \alpha^{-2})}\mathbb E[x_1] \\ &< \infty. \end{align*}

where the second line uses that the $y_i$ are pairwise independent (inherited from the $x_i$), hence pairwise uncorrelated. Therefore $P(\lim_{n \to \infty} (t_{\lfloor \alpha^n \rfloor} - \mathbb E[t_{\lfloor \alpha^n \rfloor}]) = 0) = 1$. Moreover $\mathbb E[y_i] = \mathbb E[x_1 1\\{x_1 \leq i\\}] \uparrow \mathbb E[x_1] = \mu$ by monotone convergence, so the Cesàro averages $\mathbb E[t_{\lfloor \alpha^n \rfloor}] = \frac{1}{\lfloor \alpha^n \rfloor}\sum_{i=1}^{\lfloor \alpha^n \rfloor}\mathbb E[y_i]$ also converge to $\mu$. Hence $P(\lim_{n \to \infty} t_{\lfloor \alpha^n \rfloor} = \mu) = 1$. By our earlier result, this proves that $P((1/\alpha)\mu \leq \liminf_{m \to \infty} t_m \leq \limsup_{m \to \infty} t_m \leq \alpha \mu) = 1$ for every $\alpha > 1$. Therefore

\begin{align*} P\left(\lim_{m \to \infty} t_m = \mu\right) &= P\left(\bigcap_{\varepsilon > 0} \{\liminf t_m \geq (\mu - \varepsilon)\} \cap \{\limsup t_m \leq (\mu + \varepsilon)\}\right) \\ &= P\left(\bigcap_{k \geq 1} \{\liminf t_m \geq (\tfrac{1}{1+1/k})\mu\} \cap \{\limsup t_m \leq (1 + 1/k)\mu\}\right) \\ &= \lim_{k \to \infty} P\left(\{\liminf t_m \geq (\tfrac{1}{1+1/k})\mu\} \cap \{\limsup t_m \leq (1 + 1/k)\mu\}\right) \\ &= \lim_{\alpha \searrow 1} P\left(\{\liminf t_m \geq (1/\alpha)\mu\} \cap \{\limsup t_m \leq \alpha\mu\}\right) \\ &= \lim_{\alpha \searrow 1} 1 \\ &= 1. \end{align*}

We also showed that $P(\limsup_{n \to \infty} |s_n - t_n| = 0) = 1$, so

\begin{align*} P\left(s_n \not\to \mu\right) &\leq P\left(\left\{\limsup_{n \to \infty} |s_n - t_n| \neq 0\right\} \cup \left\{t_n \not\to \mu\right\}\right) \\ &\leq P\left(\limsup_{n \to \infty} |s_n - t_n| \neq 0\right) + P\left(t_n \not\to \mu\right) \\ &= 0 + 0 \\ &= 0. \end{align*}

QED.
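To close, a simulation sketch of a setting this theorem handles but the earlier ones do not, assuming numpy (the distribution, horizon, and seed are arbitrary): a nonnegative heavy-tailed law with finite mean but infinite variance. Both the raw averages $s_n$ and the truncated averages $t_n$ settle near the mean in this realization, and $x_i \neq y_i$ occurs only a handful of times.

```python
import numpy as np

# Lomax/Pareto II samples with shape 1.5: nonnegative, finite mean 1/(1.5 - 1) = 2,
# infinite variance, so only the pairwise-independent SLLN above applies.
rng = np.random.default_rng(6)
N = 1_000_000
x = rng.pareto(1.5, size=N)                 # P(x > c) = (1 + c)^(-1.5)
i = np.arange(1, N + 1)
y = np.where(x <= i, x, 0.0)                # truncation y_i = x_i * 1{x_i <= i}
s = np.cumsum(x) / i
t = np.cumsum(y) / i
print("indices with x_i != y_i:", int(np.sum(x > i)))
for n in (100, 10_000, 1_000_000):
    print(f"n={n}: s_n = {s[n - 1]:.3f}, t_n = {t[n - 1]:.3f}, mu = 2")
```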


References

Etemadi, N. “An elementary proof of the strong law of large numbers.” Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, vol. 55, 1981, pp. 119–122. SpringerLink, https://doi.org/10.1007/BF01013465.

Durrett, Rick. Probability: Theory and Examples. 5th ed., Cambridge University Press, 2019.

Tao, Terence. “254A, Notes 1: Concentration of measure.” What’s New, 3 Jan. 2010, Blog post https://terrytao.wordpress.com/2010/01/03/254a-notes-1-concentration-of-measure/.