
Let \(X_1, X_2, X_3, \dots\) be a sequence of i.i.d. random variables with mean \(\mu\) and variance \(0 < \sigma^2 < \infty.\) Let \(Z\) be a standard normal random variable. The Central Limit Theorem states that for any real number \(t,\) \[ \lim_{n \rightarrow \infty} P\left(\frac{\sum_{i=1}^n X_i - n\mu}{\sqrt{n\sigma^2}} < t\right) = P(Z < t)\] The limit is taken of the probabilities, so it sits outside \(P\): the standardized sums converge in distribution and need not converge as random variables.

Alternatively, rewriting the fraction yields \[ \lim_{n \rightarrow \infty} P\left(\frac{\frac{1}{n}\sum_{i=1}^n X_i - \mu}{\sqrt{\sigma^2/n}} < t\right) = P(Z < t)\]
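To see why the two fractions are equal, divide the numerator and the denominator by \(n\) and use \(\sqrt{n\sigma^2}/n = \sqrt{\sigma^2/n}\): \begin{align} \frac{\sum_{i=1}^n X_i - n\mu}{\sqrt{n\sigma^2}} & = \frac{\frac{1}{n}\left(\sum_{i=1}^n X_i - n\mu\right)}{\frac{1}{n}\sqrt{n\sigma^2}} \\ & = \frac{\frac{1}{n}\sum_{i=1}^n X_i - \mu}{\sqrt{\sigma^2/n}} \end{align} The first form standardizes the sum and the second standardizes the sample average.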


For a fixed \(n,\) the sum \(X_1 + X_2 + \dots + X_n\) has mean \(n\mu\) since each \(X_i\) has mean \(\mu.\) Also, since the random variables are independent, \begin{align} \text{Var}(X_1 + X_2 + \dots + X_n) & = \text{Var}(X_1) + \text{Var}(X_2) + \dots + \text{Var}(X_n) \\ & = n\sigma^2 \end{align} So, the standard deviation of \(X_1 + X_2 + \dots + X_n\) is \(\sqrt{n\sigma^2}.\) Therefore, \[\frac{\sum_{i=1}^n X_i - n\mu}{\sqrt{n\sigma^2}}\] has mean \(0\) and variance \(1,\) just like a standard normal.

Similarly, \[E\left[\frac{1}{n}\sum_{i=1}^n X_i\right] = \mu\] and \begin{align} \text{Var}\left(\frac{1}{n}\sum_{i=1}^n X_i\right) & = \frac{1}{n^2}\sum_{i=1}^n \text{Var}(X_i) \\ & = \frac{1}{n^2} \cdot n\sigma^2 \\ & = \frac{\sigma^2}{n} \end{align} So, the standard deviation of \(\frac{1}{n}\sum_{i=1}^n X_i\) is \(\sqrt{\sigma^2/n}.\)
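The mean and variance formulas above can be checked exactly on a small case. The sketch below is only an illustration: the fair six-sided die, the choice \(n = 4,\) and the helper names `pmf_of_sum`, `mean`, and `variance` are assumptions made for this example and do not come from the text.

```python
from fractions import Fraction

def pmf_of_sum(pmf_a, pmf_b):
    """Exact pmf of A + B for independent discrete A and B (convolution)."""
    out = {}
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            out[a + b] = out.get(a + b, Fraction(0)) + pa * pb
    return out

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

# Assumed example distribution: one roll of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mu, sigma2 = mean(die), variance(die)

# Distribution of X_1 + ... + X_n for n independent rolls.
n = 4
total = die
for _ in range(n - 1):
    total = pmf_of_sum(total, die)

# Distribution of the sample average (X_1 + ... + X_n) / n.
average = {Fraction(s, n): p for s, p in total.items()}

print(mean(total) == n * mu)              # mean of the sum is n * mu
print(variance(total) == n * sigma2)      # variance of the sum is n * sigma^2
print(mean(average) == mu)                # mean of the average is mu
print(variance(average) == sigma2 / n)    # variance of the average is sigma^2 / n
```

Using `Fraction` keeps every comparison exact, so the equalities are checked without floating-point tolerance.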

What the central limit theorem shows is that if you keep adding i.i.d. random variables, the distribution of the sum, once centered and rescaled, approaches a normal distribution. Without this adjustment, the sum itself would in most cases tend to infinity or negative infinity, or fluctuate over an ever wider range. The central limit theorem gives the offset \(n\mu\) and the scaling factor \(\sqrt{n\sigma^2}\) needed to transform the sum into something with a meaningful limit.
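A small simulation can make this concrete. The sketch below is a rough illustration and not a proof: the Exponential\((1)\) distribution (mean \(1,\) variance \(1\)), the seed, the trial count, and the function names are all assumptions chosen for the example. It estimates the probability that the standardized sum falls below \(t = 0\) and compares it with \(P(Z < 0) = 0.5.\)

```python
import random
from statistics import NormalDist

def standardized_sum(n, rng):
    """Draw X_1..X_n ~ Exponential(1) and return (sum - n*mu) / sqrt(n*sigma^2)."""
    mu, sigma2 = 1.0, 1.0  # mean and variance of Exponential(1)
    total = sum(rng.expovariate(1.0) for _ in range(n))
    return (total - n * mu) / (n * sigma2) ** 0.5

def empirical_prob_below(t, n, trials=5000, seed=0):
    """Estimate P(standardized sum < t) from repeated simulated draws."""
    rng = random.Random(seed)
    hits = sum(standardized_sum(n, rng) < t for _ in range(trials))
    return hits / trials

target = NormalDist().cdf(0.0)  # P(Z < 0) for a standard normal Z
for n in (1, 10, 100):
    estimate = empirical_prob_below(0.0, n)
    print(f"n = {n:3d}: estimate = {estimate:.3f}, normal value = {target:.3f}")
```

For \(n = 1\) the estimate reflects the skewed exponential distribution, and it should move toward the normal value as \(n\) grows, up to simulation noise.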

Standard Normal Table

For convenience, we have included a standard normal table here.
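Table entries can also be computed directly. The following is a minimal sketch that assumes Python's standard-library `statistics.NormalDist`; the helper name `phi` is introduced here for illustration. It prints \(P(Z \leq z)\) for two values of \(z\) that appear in the examples on this page.

```python
from statistics import NormalDist

def phi(z):
    """Return P(Z <= z) for a standard normal random variable Z."""
    return NormalDist().cdf(z)

# Values of z looked up in the worked examples.
for z in (2.09, 2.15):
    print(f"P(Z <= {z:.2f}) = {phi(z):.5f}")
```

A printed table lists these values to a fixed number of decimals for \(z\) rounded to two places, so a direct computation can differ slightly when \(z\) has more digits.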

Example

A standardized test has been given to all \(12\)-year-olds in the United States. The average score was \(83.1\%\) and the standard deviation was \(3.2\) percentage points. What is the probability that a randomly chosen student scores above \(90\%?\)

Solution:
We can approximate the distribution of a randomly chosen student's score with a normal distribution. This assumes that a large number of students took the test and that one randomly chosen student from the whole population should generally perform the same as another. Note that the central limit theorem itself is a statement about sums, so treating an individual score as approximately normal is a modeling assumption here.

Let \(Y\) be the score of a randomly chosen student. We can standardize \(Y\) and use the standard normal table. \begin{align} P(Y > 90) & = P\left(\frac{Y-83.1}{3.2} > 2.15625\right) \\ & \approx P\left(Z > 2.15625\right) \end{align} where \(Z\) is a standard normal random variable. Using the table entry for \(2.15,\) we find \begin{align} P(Z > 2.15625) & = 1 - P(Z \leq 2.15625) \\ & \approx 1 - 0.98422 \\ & = 0.01578 \end{align} The probability is approximately \(0.01578,\) indicating that only about \(1.6\%\) of the students who took the test scored above \(90\%.\)
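The same calculation can be sketched in Python, assuming the standard library's `statistics.NormalDist` in place of the table. The variable names are illustrative. Because the code uses the unrounded value of \(z,\) its result can differ slightly from a table lookup.

```python
from statistics import NormalDist

mean_score = 83.1   # average score, in percent
sd_score = 3.2      # standard deviation of the scores

# Standardize the cutoff of 90 percent.
z = (90 - mean_score) / sd_score

# P(Y > 90) is approximated by P(Z > z) = 1 - P(Z <= z).
p_above = 1 - NormalDist().cdf(z)

print(f"z = {z:.5f}")
print(f"P(Y > 90) is approximately {p_above:.5f}")
```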

Example with a Binomial Random Variable

Let \(X\) be Binomial\((1276, 0.7).\) What is \(P(X < 700)?\)

Solution:
The probability is difficult to compute exactly, but we can use the central limit theorem to find a close approximation.

We can use the central limit theorem because a binomial random variable has the same distribution as a sum of i.i.d. Bernoulli random variables. In particular, let \(Y_1, Y_2, \dots, Y_{1276}\) be i.i.d. Bernoulli\((0.7)\) random variables. Then \(\sum_{i=1}^{1276}Y_i\) has the same distribution as \(X.\)

The mean of \(X\) is \(893.2\) and the standard deviation of \(X\) is approximately \(16.37.\) Therefore, \begin{align} P(X < 700) & = P\left(\sum_{i=1}^{1276}Y_i < 700\right) \\ & \approx P\left(\frac{\sum_{i=1}^{1276}Y_i - 893.2}{16.37} < -11.8\right) \\ & \approx P(Z < -11.8) \\ & \approx 0 \end{align} The probability is approximately \(0\) because it is highly unlikely that a standard normal is 11 standard deviations away from the mean. (More accurately, the probability is about \(0.00000000000000000000000000019\).)
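For comparison, both the normal approximation and the exact binomial probability can be computed numerically. This is a sketch under stated choices: it uses `math.erfc` for the normal tail, because \(1 + \text{erf}\) loses all precision this far from the mean, and exact integer arithmetic with \(p = 7/10\) for the binomial sum. The variable names are illustrative.

```python
from math import comb, erfc, sqrt

n, threshold = 1276, 700

# Normal approximation: standardize 700 using the mean and standard deviation of X.
mean = n * 0.7
sd = sqrt(n * 0.7 * 0.3)
z = (threshold - mean) / sd
# P(Z < z) written with erfc, which stays accurate for very small tails.
normal_approx = 0.5 * erfc(-z / sqrt(2))

# Exact value: P(X < 700) = sum over k = 0..699 of C(n, k) * 7^k * 3^(n-k) / 10^n.
numerator = sum(comb(n, k) * 7**k * 3**(n - k) for k in range(threshold))
exact = numerator / 10**n

print(f"mean = {mean:.1f}, sd = {sd:.2f}, z = {z:.2f}")
print(f"normal approximation: {normal_approx:.3e}")
print(f"exact binomial value: {exact:.3e}")
```

This far into the tail the two numbers need not agree closely, even though both are effectively \(0.\)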

Example with a Poisson Random Variable

Let \(X\) be Poisson\((965).\) Find \(P(X < 900).\)

Solution:
Recall that if \(X_1\) is Poisson\((\lambda_1)\) and \(X_2\) is Poisson\((\lambda_2),\) and \(X_1\) and \(X_2\) are independent, then \(X_1 + X_2\) is Poisson\((\lambda_1+\lambda_2).\)

Let \(Y_1, Y_2, \dots, Y_{965}\) be i.i.d. Poisson\((1)\) random variables. Then, applying the fact above repeatedly, \(\sum_{i=1}^{965}Y_i\) is Poisson\((965),\) so it has the same distribution as \(X.\)

The mean of \(X\) is \(965\) and the standard deviation of \(X\) is \(\sqrt{965} \approx 31.06.\) So, \begin{align} P(X < 900) & = P\left(\sum_{i=1}^{965} Y_i < 900\right) \\ & \approx P\left(\frac{\sum_{i=1}^{965} Y_i - 965}{31.06} < -2.09\right) \\ & \approx P(Z < -2.09) \end{align} We must rearrange the inequality to use the standard normal table. By symmetry, \[P(Z < -2.09) = P(Z \geq 2.09)\] Using \(P(A) = 1-P(A^C),\) \[P(Z \geq 2.09) = 1 - P(Z < 2.09)\] Now we can use the table, looking in row \(2.0\) and column \(.09.\) \begin{align} 1 - P(Z < 2.09) & \approx 1-.98169 \\ & = 0.01831 \end{align} So \(P(X < 900)\) is approximately \(0.018.\)
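The approximation can be compared with the exact Poisson probability. The sketch below assumes the standard library's `statistics.NormalDist` for the normal value and sums the Poisson probabilities in log space, since \(e^{-965}\) is too small to represent as a floating-point number. The helper name `poisson_pmf` is illustrative.

```python
from math import exp, fsum, lgamma, log, sqrt
from statistics import NormalDist

lam, threshold = 965, 900

# Normal approximation with the same mean and standard deviation as X.
normal_approx = NormalDist(mu=lam, sigma=sqrt(lam)).cdf(threshold)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed as exp(k*log(lam) - lam - log(k!))."""
    return exp(k * log(lam) - lam - lgamma(k + 1))

# Exact value: P(X < 900) = P(X = 0) + ... + P(X = 899).
exact = fsum(poisson_pmf(k, lam) for k in range(threshold))

print(f"normal approximation: {normal_approx:.4f}")
print(f"exact Poisson value:  {exact:.4f}")
```

The two values should be close but not identical, because the Poisson distribution is discrete and slightly skewed while the normal distribution is continuous and symmetric.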

1. A standard IQ test is designed so that an average IQ is \(100\) and the standard deviation is \(15.\) What percent of the population will have an IQ of \(125\) or higher?





2. The amount by which the price of a stock changes each day is modeled with an i.i.d. sequence of Normal\((0,1)\) random variables, \((X_i : 1 \leq i \leq n),\) where a negative value indicates that the stock decreased for the day. So, if \(c\) is the price of the stock on day \(0,\) the price of the stock on day \(n\) is \[c + \sum_{i=1}^n X_i\] What is the probability that the price of the stock has changed (gone up or down) by at least \(\$12\) in \(100\) days?





3. It is estimated that around \(40\%\) of babies born prematurely get jaundice. If a hospital sees \(948\) premature babies, what is the probability that \(400\) or more of them will have had jaundice?





4. A certain company makes approximately \(4,150\) sales on their website daily. Modeling the number of daily sales as a Poisson random variable with mean \(4,150,\) what is the probability that they make fewer than \(4,000\) sales on a particular day?



