1 Random Variables

Reading: If you want a little more information about the topics in this chapter, take a look at Dougherty R.2 - R.4 (pages 7 - 23).

1.1 Chapter Preview

In this chapter, you will learn:

The difference between a discrete and a continuous random variable
How to use the probability distribution of a discrete random variable to calculate the expected value and variance of the random variable
How to use the probability density function (PDF) of a continuous random variable to calculate the expected value and variance of the random variable
How to calculate the covariance and correlation between two random variables

Why? These probability concepts are the foundation of everything we do in Econometrics.

1.1.1 Notation

Here’s a synopsis of the notation I’ll introduce in this chapter.

Symbol	Meaning
\(X\)	random variable (RV)
\(x_i\)	a potential outcome for the random variable \(X\)
\(p_i\)	the probability a certain outcome will occur (discrete RV’s)
\(\mu_X\)	pronounced “mew”, the expected value of the random variable \(X\), also known as \(E[X]\)
\(\sigma^2_X\)	the variance of \(X\), pronounced “sigma squared”
\(\sigma_X\)	the standard deviation of \(X\)
\(\sigma_{XY}\)	the covariance of random variables \(X\) and \(Y\)
\(\rho_{XY}\)	the correlation between two random variables \(X\) and \(Y\), pronounced “rho”

1.2 Random Variables

A random variable is any variable whose value cannot be predicted exactly. A few examples:

The message you get in a fortune cookie is a random variable.
The time you spend searching for your keys after you’ve misplaced them is a random variable.
The number of likes you get on a social media post is a random variable.
The number of customers who enter a small retail store on a given day is a random variable.
The sales revenue at that small retail store on a given day is a random variable.

Some random variables are discrete while others are continuous. What’s the difference? Discrete random variables are counted, like the number of M&M’s you have, while continuous random variables are measured, like how heavy your bag of candy is. Discrete random variables take on a countable set of possible values (often just a handful), while continuous random variables can take on any value within an interval, so they have infinitely many possible values.

Other variables are categorical instead of being numeric. They may represent qualitative data that can be divided into categories or groups. We’ll lump categorical variables in with discrete variables and we’ll explain why toward the end of this course (see Chapter 14: Dummy Variables).

Example: The message you get in a fortune cookie takes on values like “An exciting opportunity lies ahead of you”, which is categorical rather than numeric. We’ll therefore label this random variable a discrete random variable.

Example: The “time you spend searching for your keys after you’ve misplaced them” takes on values like 3.982 minutes. It’s a value that can be measured precisely instead of counted, so we’d label this random variable a continuous random variable.

Exercise 1: Identify which of these might be best modeled as discrete random variables, and which might be best modeled as continuous random variables.

The number of likes you get on a social media post
The weight of a truck
The number of customers who enter a small retail store on a given day
The height of a child
The speed of a train

1.3 Discrete Probability Distributions

Consider the discrete random variable “a dice roll”. It could take on values 1 to 6, and if it’s a fair die, it takes on each of those values with probability 1/6. Our notation will be: \(X\) is the random variable, \(x_i\) is a potential outcome for \(X\), and each potential outcome \(x_i\) happens with probability \(p_i\):

Table 1.1: Potential Outcomes and Probabilities for a Dice Roll
\(x_i\)	1	2	3	4	5	6
\(p_i\)	1/6	1/6	1/6	1/6	1/6	1/6

Consider another random variable \(X\) as the sum of two dice rolls. In Table 1.2, the first row represents the potential outcomes for the first roll and the first column represents the potential outcomes for the second roll. The values in the interior of the table represent potential outcomes for \(X\) (the sum).

Table 1.2: Potential Outcomes for the Sum of 2 Dice Rolls
	1	2	3	4	5	6
1	2	3	4	5	6	7
2	3	4	5	6	7	8
3	4	5	6	7	8	9
4	5	6	7	8	9	10
5	6	7	8	9	10	11
6	7	8	9	10	11	12

Each of the 36 cells in Table 1.2 occurs with equal probability. That implies that \(X = 2\) with probability 1/36. And since there are two ways to get to \(X = 3\) (1 + 2 and 2 + 1), this outcome occurs with probability 2/36. Table 1.3 shows the probability of each potential outcome for \(X\):

Table 1.3: Potential Outcomes and Probabilities for the Sum of 2 Dice Rolls
\(x_i\)	2	3	4	5	6	7	8	9	10	11	12
\(p_i\)	1/36	2/36	3/36	4/36	5/36	6/36	5/36	4/36	3/36	2/36	1/36
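If you'd like to check Table 1.3 by brute force, here is a minimal Python sketch that counts the cells of Table 1.2. The names `faces`, `counts`, and `sum_dist` are just illustrative choices, not anything from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every (first roll, second roll) pair is one equally likely cell of Table 1.2.
faces = range(1, 7)
counts = Counter(a + b for a, b in product(faces, repeat=2))

# Probability of each sum = (number of cells with that sum) / 36.
sum_dist = {s: Fraction(n, 36) for s, n in sorted(counts.items())}

for s in sum_dist:
    print(s, f"{counts[s]}/36")
```

Note that `Fraction` reduces fractions automatically (2/36 is stored as 1/18), which is why the loop prints the raw counts over 36 to match the table.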

Exercise 2: Let X be the random variable “the number of likes you get on a social media post”. Suppose that you get 0-4 likes per post each with equal probability. Fill out Table 1.4 below.

Table 1.4: Number of Likes on Social Media: Potential Outcomes and Probabilities
\(x_i\)	0	1	2	3	4
\(p_i\)	__	__	__	__	__

1.4 Expected Values of Discrete Random Variables

The expected value of a random variable is its long-term average. We’ll reserve the Greek letter \(\mu\), pronounced “mew”, to refer to expected values: that is, we’ll say that the expected value of \(X\) is \(\mu_X\), or that \(E[X] = \mu_X\). If the variable is discrete, you can calculate its expectation by taking the sum of all possible values of the random variable, each multiplied by its corresponding probability. So the expectation of a discrete random variable \(X\) is \(E[X] = \sum_i x_i p_i\), where \(x_i\) is a potential outcome for \(X\) and \(p_i\) is the probability that outcome occurs.

Example: To find the expected value \(E[X]\) of a dice roll, consult Table 1.1: \[\begin{align} E[X] &= \sum_{i = 1}^n x_i p_i \\ &= 1 (1/6) + 2 (1/6) + 3 (1/6)\\ &\hspace{.7cm} + 4 (1/6) + 5 (1/6) + 6 (1/6) \\ & = 21/6 \\ &= 3.5 \end{align}\]
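The same calculation can be sketched in a few lines of Python. The helper name `expected_value` and the dictionary representation of Table 1.1 are assumptions made for illustration.

```python
from fractions import Fraction

def expected_value(dist):
    """Return E[X] = sum of x_i * p_i for a dict mapping outcome -> probability."""
    return sum(x * p for x, p in dist.items())

# Table 1.1: a fair die takes each value 1..6 with probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mu = expected_value(die)
print(mu, float(mu))  # 7/2 3.5
```

Using exact fractions avoids floating-point rounding, so the result matches the hand calculation of 21/6 exactly.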

Exercise 3: Let X be the random variable “the number of likes you get on a social media post” and suppose that you get 0-4 likes per post each with equal probability. How many likes do you get in expectation for a post? That is, what is \(E[X]\)?

1.4.1 Expected Value Rules

Here are some math rules about the way expected values work. Let \(X\), \(Y\), and \(Z\) be random variables and let \(b\) be a constant.

The expectation of the sum of several random variables is the sum of their expectations: \(E[X + Y + Z] = E[X] + E[Y] + E[Z]\).
Constants can pass outside of an expectation: \(E[bX] = b E[X]\)
The expected value of a constant is that constant: \(E[b] = b\).

Example: Let \(X\) and \(Y\) be random variables and let \(b_1\) and \(b_2\) be constants. If \(Y = b_1 + b_2 X\), then since \(E[Y] = E[b_1 + b_2 X]\), we can simplify using the rules above to get: \(E[Y] = b_1 + b_2 E[X]\).
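To see that this rule agrees with the definition of expected value, here is a small Python check using a fair dice roll for \(X\). The constants `b1 = 2` and `b2 = 10` are hypothetical values picked only for the illustration.

```python
from fractions import Fraction

def expected_value(dist):
    """E[X] for a list of (outcome, probability) pairs."""
    return sum(x * p for x, p in dist)

die = [(x, Fraction(1, 6)) for x in range(1, 7)]

# Hypothetical constants, chosen only for illustration.
b1, b2 = 2, 10

# Y = b1 + b2 * X takes the value b1 + b2 * x_i with the same probability p_i.
y_dist = [(b1 + b2 * x, p) for x, p in die]

direct = expected_value(y_dist)             # E[Y] from Y's own distribution
via_rules = b1 + b2 * expected_value(die)   # b1 + b2 * E[X]
print(direct, via_rules)  # 37 37
```

Both routes give the same number, which is all the rules claim: you never need to build the distribution of \(Y\) if you already know \(E[X]\).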

Exercise 4: Let \(X\) be a random variable and suppose \(E[X] = 3\). If \(Y = 3 + 5 X\), what is \(E[Y]\)?

1.5 Variance

The variance of a random variable measures its dispersion: it asks, “on average, how far is the variable from its average?” Differences are squared to get rid of the negative signs and to penalize large deviations a little more. We’ll reserve the Greek letter “sigma” \(\sigma\) for variance (\(\sigma^2\)) and standard deviation (\(\sigma\)). The formula:

\[\begin{align} Var(X) = \sigma_X^2 &= E\left[(X - \mu_X)^2\right]\\ &= (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + ... + (x_n - \mu_X)^2 p_n\\ &= \sum_{i = 1}^n (x_i - \mu_X)^2 p_i \end{align}\]

Notice that because of the square and the fact that probabilities \(p_i\) are never negative, the variance of a random variable can never be a negative number.

Example: Let \(X\) be a dice roll. Then to find \(Var(X)\), recall that we already calculated \(\mu_X = 3.5\), so: \[\begin{align} Var(X) &= E[(X - 3.5)^2]\\ &= \frac{1}{6} (1 - 3.5)^2 + \frac{1}{6} (2 - 3.5)^2 + \frac{1}{6} (3 - 3.5)^2 + \\ &\hspace{.7cm} \frac{1}{6} (4 - 3.5)^2 + \frac{1}{6} (5 - 3.5)^2 + \frac{1}{6} (6 - 3.5)^2 \\ &= \frac{17.5}{6} \\ &\approx 2.9167 \end{align}\] This is a measure of the dispersion of a dice roll.
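Here is a minimal Python sketch of the same variance calculation, again assuming a dictionary that maps each outcome in Table 1.1 to its probability. The variable names are illustrative.

```python
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}

# Step 1: the expected value mu_X.
mu = sum(x * p for x, p in die.items())

# Step 2: the probability-weighted average of squared deviations from mu_X.
var = sum((x - mu) ** 2 * p for x, p in die.items())

# The standard deviation sigma_X is the square root of the variance.
sd = float(var) ** 0.5

print(var, round(float(var), 4))  # 35/12 2.9167
print(round(sd, 4))               # 1.7078
```

The exact answer 35/12 is the same number as 17.5/6 in the hand calculation.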

Exercise 5: Again let \(X\) be the random variable “the number of likes you get on a social media post” where you get 0-4 likes per post each with equal probability. What is the variance of \(X\)?

1.5.1 Variance Rules

Here are some rules about the way variance works. Let \(X\) and \(Y\) be random variables and let \(b\) be a constant.

The variance of the sum of two random variables is the sum of their variances plus two times their covariance: \(Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)\)
Constants can pass outside of a variance if you square them: \(Var(bX) = b^2 Var(X)\)
The variance of a constant is 0: \(Var(b) = 0\).
The variance of a random variable plus a constant is the variance of that random variable: \(Var(X + b) = Var(X)\).
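These rules can be checked numerically. The sketch below uses one fair die for \(X\), a second independent fair die for \(Y\), and a hypothetical constant `b = 4`; the helper names `mean` and `variance` are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product

def mean(dist):
    """E[X] for a list of (outcome, probability) pairs."""
    return sum(x * p for x, p in dist)

def variance(dist):
    """Var(X) = E[(X - mu)^2] for a list of (outcome, probability) pairs."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist)

die = [(x, Fraction(1, 6)) for x in range(1, 7)]
b = 4  # hypothetical constant

var_x = variance(die)
var_scaled = variance([(b * x, p) for x, p in die])    # Var(bX)
var_shifted = variance([(x + b, p) for x, p in die])   # Var(X + b)
var_const = variance([(b, Fraction(1))])               # Var(b)

# Two independent fair dice: each (x, y) pair has probability 1/36.
pairs = [((x, y), Fraction(1, 36)) for x, y in product(range(1, 7), repeat=2)]
var_sum = variance([(x + y, p) for (x, y), p in pairs])  # Var(X + Y)
mu_x = mean(die)
cov_xy = sum((x - mu_x) * (y - mu_x) * p for (x, y), p in pairs)

print(var_scaled == b**2 * var_x)             # True
print(var_shifted == var_x)                   # True
print(var_const == 0)                         # True
print(var_sum == var_x + var_x + 2 * cov_xy)  # True
```

Because the two dice are independent, `cov_xy` comes out to 0 here, so the variance of the sum is just the sum of the variances.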

Exercise 6: Let \(X\) be a random variable and suppose \(Var(X) = 3\). If \(Y = 3 + 5 X\), what is \(Var(Y)\)?

1.6 Covariance

The covariance of two random variables (\(\sigma_{XY}\)) is a measure of the linear association between those variables. For example, since people who are taller are generally heavier, we’d say that the random variables \(height\) and \(weight\) have a positive covariance. On the other hand, if large values for one random variable tend to correspond to small values in the other, we’d say the two variables have a negative covariance (think of temperature and sales of winter coats). Two variables that are independent have a covariance of 0. The formula for covariance:

\[Cov(X, Y) = \sigma_{XY} = E[(X - \mu_X) (Y - \mu_Y)]\]

Notice that the covariance of a random variable \(X\) with itself is the variance of \(X\).
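To make the formula concrete, here is a Python sketch that computes covariances by enumerating two fair dice rolls. Treating each random variable as a function of the pair of rolls `w` is a modeling choice for this illustration, and the names `E`, `cov`, `first`, `second`, `total`, and `reverse` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely (first roll, second roll) pairs.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))

def E(f):
    """Expected value of the random variable f over the sample space."""
    return sum(f(w) * p for w in outcomes)

def cov(f, g):
    """Cov(f, g) = E[(f - mu_f)(g - mu_g)]."""
    mu_f, mu_g = E(f), E(g)
    return E(lambda w: (f(w) - mu_f) * (g(w) - mu_g))

first = lambda w: w[0]          # the first roll
second = lambda w: w[1]         # the second roll (independent of the first)
total = lambda w: w[0] + w[1]   # the sum of both rolls
reverse = lambda w: 7 - w[0]    # large when the first roll is small

print(cov(first, second))   # 0       independent rolls
print(cov(first, total))    # 35/12   positive: a big first roll means a big sum
print(cov(first, reverse))  # -35/12  negative: they move in opposite directions
print(cov(first, first))    # 35/12   equals Var(first roll)
```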

1.6.1 Covariance Rules

Here are some rules about the way covariance works. Let \(X\), \(Y\), and \(Z\) be random variables and let \(b\) be a constant.

The covariance of a random variable with a constant is 0: \(Cov(X, b) = 0\).
As mentioned above, the covariance of a random variable with itself is its variance: \(Cov(X, X) = Var(X)\).
You can bring constants outside of the covariance: \(Cov(X, bY) = b Cov(X, Y)\).
Covariance distributes over sums: \(Cov(X, Y + Z) = Cov(X, Y) + Cov(X, Z)\).
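The four rules above can be verified on a small example. In this sketch, \(X\), \(Y\), and \(Z\) are all built from the same two dice rolls so that they are correlated with each other; those definitions and the constant `b = 3` are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two fair dice rolls
p = Fraction(1, len(outcomes))

def E(f):
    """Expected value of the random variable f over the sample space."""
    return sum(f(w) * p for w in outcomes)

def cov(f, g):
    """Cov(f, g) = E[(f - mu_f)(g - mu_g)]."""
    mu_f, mu_g = E(f), E(g)
    return E(lambda w: (f(w) - mu_f) * (g(w) - mu_g))

X = lambda w: w[0]          # first roll
Y = lambda w: w[0] + w[1]   # sum of the rolls
Z = lambda w: max(w)        # larger of the two rolls
b = 3                       # hypothetical constant

rule_constant = cov(X, lambda w: b) == 0
rule_self = cov(X, X) == E(lambda w: (X(w) - E(X)) ** 2)
rule_scale = cov(X, lambda w: b * Y(w)) == b * cov(X, Y)
rule_sum = cov(X, lambda w: Y(w) + Z(w)) == cov(X, Y) + cov(X, Z)

print(rule_constant, rule_self, rule_scale, rule_sum)  # True True True True
```

Exact fractions are used so that the equality checks are not thrown off by floating-point rounding.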

Exercise 7: Let \(X\) and \(Y\) be random variables. Simplify \(Cov(X, Y + 3X)\).

1.7 Correlation

An issue with covariance is that the covariance between two random variables depends on the units those variables are measured in. That’s where correlation comes in: correlation is another measure of linear association that has the benefit of being dimensionless because the units in the numerator cancel with the units in the denominator. It also turns out that the correlation between two variables is always between -1 and 1. When correlation = 1, the two variables have a perfect positive linear relationship, and when correlation = -1, the two variables have a perfect negative linear relationship.

We’ll reserve the Greek letter “rho” \(\rho\) to refer to the correlation between two random variables. The formula:

\[\rho_{XY} = \frac{\sigma_{XY}}{\sqrt{\sigma_X^2 \sigma_Y^2}}\]

That is, correlation is the covariance of two random variables divided by the square root of the variances of both random variables multiplied together.
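Here is a Python sketch of the units argument: rescaling one variable changes the covariance but leaves the correlation alone. It uses the first of two fair dice rolls and their sum as the two random variables; those choices, the factor of 100, and the helper names are assumptions for illustration.

```python
import math
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two fair dice rolls
p = Fraction(1, len(outcomes))

def E(f):
    """Expected value of the random variable f over the sample space."""
    return sum(f(w) * p for w in outcomes)

def cov(f, g):
    """Cov(f, g) = E[(f - mu_f)(g - mu_g)]."""
    mu_f, mu_g = E(f), E(g)
    return E(lambda w: (f(w) - mu_f) * (g(w) - mu_g))

def corr(f, g):
    """Correlation: covariance divided by sqrt(Var(f) * Var(g))."""
    return float(cov(f, g)) / math.sqrt(float(cov(f, f) * cov(g, g)))

X = lambda w: w[0]               # first roll
S = lambda w: w[0] + w[1]        # sum of the rolls
X_scaled = lambda w: 100 * w[0]  # same variable in different "units"

print(cov(X, S), cov(X_scaled, S))                        # 35/12 875/3
print(round(corr(X, S), 4), round(corr(X_scaled, S), 4))  # 0.7071 0.7071
```

The covariance grows by the factor of 100, while the correlation stays at about 0.7071 either way.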

Exercise 8: Suppose that \(Cov(X, Y) = 2\), \(Var(X) = 4\), and \(Var(Y) = 4\). What is the correlation between \(X\) and \(Y\)?

1.8 Continuous Random Variables

When a variable can take on any value in a continuous range, there are so many possible values that the probability it takes on any one exact value must be zero. So for continuous random variables, we use probability density functions (PDFs) and we talk about the probability that the random variable lies within an interval.

For example, suppose that the “time spent searching for your keys” is a continuous random variable with uniform probability between 0 and 5 minutes. Figure 1.1 illustrates the PDF: since the area under the PDF must be 1, we know that the height of the rectangle is 1/5.

Example: In the example above in Figure 1.1, the probability that you spend less than 1 minute searching for your keys is \(1 \times \frac{1}{5} = 0.20\).
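Probabilities for a continuous random variable are areas under the PDF. The Python sketch below approximates that area with a midpoint Riemann sum for the uniform key-search example; the function names `uniform_pdf` and `prob_between` and the number of slices are assumptions for illustration. For a rectangle you could of course just multiply width by height.

```python
def uniform_pdf(x, low=0.0, high=5.0):
    """PDF of a uniform random variable on [low, high]: constant height 1/(high - low)."""
    return 1.0 / (high - low) if low <= x <= high else 0.0

def prob_between(a, b, n=100_000):
    """Approximate P(a <= X <= b) as the area under the PDF (midpoint Riemann sum)."""
    width = (b - a) / n
    return sum(uniform_pdf(a + (i + 0.5) * width) for i in range(n)) * width

print(round(prob_between(0, 1), 6))  # 0.2
print(round(prob_between(0, 5), 6))  # 1.0
```

The second line confirms that the total area under the PDF is 1.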

Exercise 9: In Figure 1.1, what is the probability you spend 3 minutes or more searching for your keys?

To find the expected value or variance of a continuous random variable instead of a discrete random variable, just replace the sums with integrals and the probabilities \(p_i\) with the PDF \(f(x)\):

	\(E[X]\)	\(Var(X) = E[(X - \mu_X)^2]\)
Discrete	\(\sum_{i = 1}^n x_i p_i\)	\(\sum_{i=1}^n (x_i - \mu_X)^2p_i\)
Continuous	\(\int x f(x) dx\)	\(\int (x - \mu_X)^2 f(x) dx\)
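As a rough illustration of the continuous formulas, the Python sketch below numerically integrates the uniform key-search PDF on 0 to 5 minutes. The `integrate` helper is a simple midpoint Riemann sum written for this example, not a library routine, and all names are hypothetical.

```python
def integrate(g, a, b, n=100_000):
    """Approximate the integral of g from a to b with a midpoint Riemann sum."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

LOW, HIGH = 0.0, 5.0

def pdf(x):
    """Uniform PDF on [LOW, HIGH]: constant height 1/5."""
    return 1.0 / (HIGH - LOW)

# E[X]: integrate x * f(x) over the range where the PDF is positive.
mu = integrate(lambda x: x * pdf(x), LOW, HIGH)

# Var(X): integrate (x - mu)^2 * f(x) over the same range.
var = integrate(lambda x: (x - mu) ** 2 * pdf(x), LOW, HIGH)

print(round(mu, 4), round(var, 4))  # 2.5 2.0833
```

Doing the integrals by hand gives exactly 5/2 for the expected value and 25/12 for the variance, which is what the approximation is converging to.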

1.9 Conclusion

So there you have it: now you’re refreshed on the difference between discrete and continuous random variables and you have some comfort with using a probability distribution to find a random variable’s expected value and variance. You’ve also been introduced to the concepts of covariance and correlation.