The standard deviation of a discrete random variable measures how far the values of the random variable typically fall from the mean.
Let \(X\) be a discrete random variable. The standard deviation of \(X\) is
\[\sigma_X = \sqrt{E[(X - E[X])^2]}\]
The variance of a discrete random variable is the square of the standard deviation.
Let \(X\) be a discrete random variable. The variance of \(X\) is
\[\sigma_X^2 = E[(X - E[X])^2]\]
The variance is also written \(\text{Var}(X).\)
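To make these definitions concrete, here is a minimal Python sketch that computes \(E[X],\) \(\text{Var}(X),\) and \(\sigma_X\) directly from the definitions above. The function names, the dictionary representation of the distribution, and the fair-die example are our own illustrative choices, not part of the text.

```python
import math

def expectation(pmf):
    """E[X] for a discrete random variable given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[(X - E[X])^2], computed directly from the definition."""
    mu = expectation(pmf)
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

def std_dev(pmf):
    """sigma_X is the square root of the variance."""
    return math.sqrt(variance(pmf))

# Illustration: a fair six-sided die.
pmf = {k: 1 / 6 for k in range(1, 7)}
print(expectation(pmf), variance(pmf), std_dev(pmf))  # 3.5, ~2.917, ~1.708
```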
Recall the distance formula from algebra. The distance between the points \((x_1, y_1)\) and \((x_2, y_2)\) is \[\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}\] Similarly, in \(3\) dimensions, the distance between the points \((x_1, y_1, z_1)\) and \((x_2, y_2, z_2)\) is \[\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}\] This pattern continues for any number of coordinates.
For this example, let \(X\) be a discrete random variable that has non-zero probability on the points \(\{a_1, a_2, \dots, a_n\}.\) We can make one \(n\)-dimensional point that stores the values that \(X\) may take: \[(a_1, a_2, \dots, a_n )\] We can make another \(n\)-dimensional point that has \(E[X]\) in every coordinate: \[(E[X], E[X], \dots, E[X])\] The standard deviation of \(X\) is the distance between these two points, weighted by the probabilities of \(X:\) \[\sigma_X = \sqrt{P(X = a_1)(a_1 - E[X])^2 + \dots + P(X = a_n)(a_n - E[X])^2}\]
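Read this way, the standard deviation is just a probability-weighted Euclidean distance. Here is a short Python sketch of that idea; the function name `weighted_distance` and the sample distribution are our own illustrations, not from the text.

```python
import math

def weighted_distance(p, q, weights):
    """Euclidean distance between points p and q, with each squared
    coordinate difference scaled by the corresponding weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(p, q, weights)))

# Illustration: X is 0 or 10, each with probability 0.5, so E[X] = 5.
values = (0, 10)        # (a_1, ..., a_n)
probs = (0.5, 0.5)      # P(X = a_i)
mu = sum(a * p for a, p in zip(values, probs))
print(weighted_distance(values, (mu,) * len(values), probs))  # 5.0
```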
Example using the distance formula: Let \(X\) be a random variable defined by
\[P(X = 2) = 0.8, P(X = 5) = 0.2\]
We will use the distance formula to find the standard deviation.
The first point is made of the values \(X\) can take. This is \((2, 5).\)
The second point should have the same number of coordinates as the first, \(2\) in this case, and every coordinate should be \(E[X].\) We have to find \(E[X].\)
\[E[X] = 2 \cdot 0.8 + 5 \cdot 0.2 = 1.6 + 1 = 2.6\]
So, the second point is \((2.6, 2.6).\)
Finally, we need to find the distance between \((2, 5)\) and \((2.6, 2.6),\) but we need to weight the squared differences by how likely it is that \(X\) takes those values. The ordinary (unweighted) distance formula would give
\[\sqrt{(2 - 2.6)^2 + (5 - 2.6)^2}\]
With the weights, the standard deviation is
\begin{align}
\sqrt{0.8 \cdot (2 - 2.6)^2 + 0.2 \cdot (5 - 2.6)^2} & = \sqrt{0.8 \cdot (0.6)^2 + 0.2 \cdot (2.4)^2} \\
& = \sqrt{0.288 + 1.152} \\
& = \sqrt{1.44} \\
& = 1.2
\end{align}
So, \(\sigma_X = 1.2.\)
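As a quick sanity check, the arithmetic above takes only a couple of lines of Python:

```python
import math

# sigma_X = sqrt(0.8 * (2 - 2.6)^2 + 0.2 * (5 - 2.6)^2)
sigma = math.sqrt(0.8 * (2 - 2.6) ** 2 + 0.2 * (5 - 2.6) ** 2)
print(sigma)  # 1.2 (up to floating-point rounding)
```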
We will work with variance more often than standard deviation.
We can use the linearity of expectation to derive an alternate variance formula, one that is often easier to compute. Since \(E[X]\) is a constant, \(E[2XE[X]] = 2E[X]E[X]\) and \(E[E[X]^2] = E[X]^2:\)
\begin{align}
\text{Var}(X) & = E[(X - E[X])^2] \\
& = E[X^2 - 2XE[X] + E[X]^2] \\
& = E[X^2] - E[2XE[X]] + E[E[X]^2] \\
& = E[X^2] - 2E[X]E[X] + E[X]^2 \\
& = E[X^2] - E[X]^2
\end{align}
So, \(\text{Var}(X) = E[X^2] - E[X]^2.\)
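The identity is easy to verify numerically. Here is a minimal sketch, with function names of our own, that computes the variance both ways on a small illustrative distribution and confirms they agree:

```python
def expectation(pmf):
    return sum(x * p for x, p in pmf.items())

def var_definition(pmf):
    """Var(X) = E[(X - E[X])^2]."""
    mu = expectation(pmf)
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

def var_alternate(pmf):
    """Var(X) = E[X^2] - E[X]^2."""
    e_x2 = sum(x ** 2 * p for x, p in pmf.items())
    return e_x2 - expectation(pmf) ** 2

# Illustration: X is 1 or 3, each with probability 0.5.
pmf = {1: 0.5, 3: 0.5}
print(var_definition(pmf), var_alternate(pmf))  # both 1.0
```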
Example: Let's use the alternate formula to compute the variance and standard deviation of the same random variable \(X\) from the distance formula example above. In that example, we defined \(X\) by
\[P(X = 2) = 0.8, P(X = 5) = 0.2\]
Squaring each value that \(X\) takes gives the distribution of \(X^2:\)
\[P(X^2 = 4) = 0.8, P(X^2 = 25) = 0.2\]
Now we can compute the expected values:
\begin{align}
& E[X] = 2 \cdot 0.8 + 5 \cdot 0.2 = 1.6 + 1 = 2.6 \\
& E[X^2] = 4 \cdot 0.8 + 25 \cdot 0.2 = 3.2 + 5 = 8.2
\end{align}
Last, we plug these values into the variance formula.
\begin{align}
\text{Var}(X) & = E[X^2] - E[X]^2 \\
& = 8.2 - 2.6^2 \\
& = 1.44
\end{align}
So, \(\text{Var}(X) = 1.44.\)
The standard deviation of \(X\) is \(\sigma_X = \sqrt{1.44} = 1.2,\) which matches what we found using the distance formula method. Both methods give the same answer.
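Both computations are easy to confirm in code; a few lines (variable names ours) reproduce \(E[X],\) \(E[X^2],\) the variance, and the standard deviation for this distribution:

```python
import math

pmf = {2: 0.8, 5: 0.2}
e_x = sum(x * p for x, p in pmf.items())        # E[X] = 2.6
e_x2 = sum(x ** 2 * p for x, p in pmf.items())  # E[X^2] = 8.2
var = e_x2 - e_x ** 2                           # Var(X) = 1.44
print(var, math.sqrt(var))  # 1.44, 1.2 (up to floating-point rounding)
```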
Check your understanding:
1. A fair coin is flipped twice. Let \(X\) be the number of heads. What is \(\text{Var}(X)?\)
2. A random variable \(X\) has the following distribution: \[P(X = 0) = 0.1, P(X = 1) = 0.3, P(X = 2) = 0.2, P(X = 3) = 0.4\] What is the standard deviation of \(X?\)
3. A random variable is known to have the following:
\[E[2X^2-1] = 3, \text{Var}(X) = 1.5\]
What is \(E[X]?\)
4. If \(\text{Var}(X) = 121,\) which of the following must be true?