Sum of two dependent normal distributions

A common misconception among students of probability theory is the belief that the sum of two normally distributed (http://planetmath.org/NormalRandomVariable) random variables is itself normally distributed. By constructing a counterexample, we show this to be false.

It is however well known that the sum of normally distributed variables X,Y will be normal under either of the following situations.

  • X,Y are independent.

  • X,Y are joint normal (http://planetmath.org/JointNormalDistribution).

The statement that the sum of two independent normal random variables is itself normal is a very useful and often used property. Furthermore, when working with normal variables which are not independent, it is common to suppose that they are in fact joint normal. This can lead to the belief that this property always holds.

Another common fallacy, which our example shows to be false, is that normal random variables are independent if and only if their covariance, defined by

\begin{align}
\textrm{Cov}(X,Y) \equiv E[XY] - E[X]\,E[Y]
\end{align}

is zero. While it is certainly true that independent variables have zero covariance, the converse statement does not hold. Again, it is often supposed that they are joint normal, in which case a zero covariance will indeed imply independence.

We construct a pair of random variables X,Y satisfying the following.

  1. X and Y each have the standard normal distribution.

  2. The covariance, Cov(X,Y), is zero.

  3. The sum X+Y is not normally distributed, and X and Y are not independent.

We start with a pair of independent random variables X,ϵ where X has the standard normal distribution and ϵ takes the values 1,-1, each with a probability of 1/2. Then set,

\begin{align}
Y = \left\{
\begin{array}{l l}
\epsilon X, & \textrm{if } |X| \leq 1, \\
-\epsilon X, & \textrm{if } |X| > 1.
\end{array}
\right.
\end{align}

If S is any measurable subset of the real numbers, the symmetry of the normal distribution implies that ℙ(X∈S) is equal to ℙ(-X∈S). Then, by the independence of X and ϵ, the distribution of Y conditional on ϵ=1 is given by

\begin{align}
P(Y \in S \mid \epsilon = 1) &= P(|X| \leq 1, X \in S) + P(|X| > 1, -X \in S) \\
&= P(|X| \leq 1, X \in S) + P(|X| > 1, X \in S) \\
&= P(X \in S).
\end{align}

It can similarly be shown that ℙ(Y∈S∣ϵ=-1) is equal to ℙ(X∈S). So, Y has the same distribution as X and is normal with mean zero and variance one.

Using the fact that ϵ has zero mean and is independent of X, it is easily shown that the covariance of X and Y is zero.

\begin{align}
\textrm{Cov}(X,Y) &= E[XY] - E[X]\,E[Y] = E[XY] \\
&= E\big[\mathbf{1}_{\{|X| \leq 1\}}\, \epsilon X^2\big] + E\big[-\mathbf{1}_{\{|X| > 1\}}\, \epsilon X^2\big] \\
&= E[\epsilon]\, E\big[\mathbf{1}_{\{|X| \leq 1\}} X^2\big] - E[\epsilon]\, E\big[\mathbf{1}_{\{|X| > 1\}} X^2\big] \\
&= 0.
\end{align}

As X and Y have zero covariance and each have variance equal to 1, the sum X+Y will have variance equal to 2. Also, the sum satisfies

\begin{align}
X + Y = \left\{
\begin{array}{l l}
(1+\epsilon)X, & \textrm{if } |X| \leq 1, \\
(1-\epsilon)X, & \textrm{if } |X| > 1.
\end{array}
\right.
\end{align}

In particular, X+Y vanishes whenever ϵ=-1 and |X|≤1, or ϵ=1 and |X|>1, so ℙ(X+Y=0) = 1/2. However, normal random variables with nonzero variance have continuous distributions, and so take any single value with probability zero. So, X+Y is not normally distributed.

This also shows that, despite having zero covariance, X and Y are not independent. If they were, then the fact that sums of independent normals are normal would imply that X+Y is normal, contradicting what we have just demonstrated.
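This counterexample is easy to check numerically. The following sketch (Python with NumPy and SciPy, which we assume are available; the variable names are ours) simulates the construction, confirms that Y passes a normality test, that the sample covariance is near zero, and that X+Y has an atom at zero, so the sum cannot be normal.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 1_000_000

# X ~ N(0,1) and eps = +/-1 with probability 1/2 each, independent of X.
x = rng.standard_normal(n)
eps = rng.choice([-1.0, 1.0], size=n)

# Y = eps*X when |X| <= 1, and -eps*X when |X| > 1.
y = np.where(np.abs(x) <= 1, eps * x, -eps * x)

print(stats.kstest(y, "norm").pvalue)  # typically not small: Y looks N(0,1)
print(np.cov(x, y)[0, 1])              # close to 0: zero covariance

s = x + y
print(np.mean(s == 0))                 # ~ 0.5: an atom at zero, so s is not normal
print(s.var())                         # ~ 2, as computed above
```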


5.3.2 Bivariate Normal Distribution

Remember that the normal distribution is very important in probability theory and it shows up in many different applications. We have discussed a single normal random variable previously; we will now talk about two or more normal random variables. We recently saw in Theorem 5.2 that the sum of two independent normal random variables is also normal. However, if the two normal random variables are not independent, then their sum is not necessarily normal. Here is a simple counterexample.

Example
Let $X \sim N(0,1)$ and $W \sim Bernoulli\left(\frac{1}{2}\right)$ be independent random variables. Define the random variable $Y$ as a function of $X$ and $W$: \begin{equation} \nonumber Y = h(X,W)=\left\{ \begin{array}{l l} X & \quad \textrm{if }W=0 \\ & \quad \\ -X & \quad \textrm{if }W=1 \end{array} \right. \end{equation} Find the PDF of $Y$ and $X+Y$.

  • Solution
    • Note that by symmetry of $N(0,1)$ around zero, $-X$ is also $N(0,1)$. In particular, we can write \begin{align}%\label{} F_Y(y) &=P(Y \leq y)\\ &=P(Y \leq y | W=0) P(W=0)+P(Y \leq y | W=1) P(W=1)\\ &=\frac{1}{2} P(X \leq y | W=0) +\frac{1}{2} P(-X \leq y | W=1) \\ &=\frac{1}{2} P(X \leq y) +\frac{1}{2} P(-X \leq y) \hspace{15pt} \textrm{(since $X$ and $W$ are independent)}\\ &=\frac{1}{2} \Phi (y) +\frac{1}{2} \Phi (y) \hspace{15pt} (\textrm{since $X$ and $-X$ are }N(0,1))\\ &=\Phi (y). \end{align} Thus, $Y \sim N(0,1)$. Now, note that \begin{equation} \nonumber Z=X+Y = \left\{ \begin{array}{l l} 2X & \quad \textrm{with probability }\frac{1}{2} \\ & \quad \\ 0 & \quad \textrm{with probability }\frac{1}{2} \end{array} \right. \end{equation} Thus, $Z$ is a mixed random variable and its PDF is given by \begin{align}%\label{} \nonumber f_Z(z)&=\frac{1}{2} \delta(z)+\frac{1}{2} \hspace{15pt}(\textrm{PDF of $2X$ at $z$}), \\ \nonumber f_Z(z)&=\frac{1}{2} \delta(z)+\frac{1}{2} \hspace{15pt}(\textrm{PDF of a $N(0,4)$ at $z$}) \\ \nonumber &=\frac{1}{2} \delta(z)+\frac{1}{4\sqrt{2 \pi}}e^{-\frac{z^2}{8}}. \end{align} In particular, note that $X$ and $Y$ are both normal but their sum is not. Now, we are ready to define bivariate normal or jointly normal random variables.
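As a quick numerical check of this example (a sketch in Python with NumPy and SciPy, which we assume are available; the variable names are ours), we can simulate $(X, W, Y)$ and confirm the point mass of $Z = X + Y$ at zero and the $N(0,4)$ shape of its continuous part:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 1_000_000

x = rng.standard_normal(n)       # X ~ N(0,1)
w = rng.integers(0, 2, size=n)   # W ~ Bernoulli(1/2), independent of X
y = np.where(w == 0, x, -x)      # Y = X if W = 0, and -X if W = 1

print(stats.kstest(y, "norm").pvalue)  # typically not small: Y looks N(0,1)

z = x + y
print(np.mean(z == 0))                 # ~ 1/2: the point mass at zero
nonzero = z[z != 0]                    # the continuous part is 2X (given W = 0)
print(stats.kstest(nonzero, "norm", args=(0, 2)).pvalue)  # N(0,4) has std 2
```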


Definition
Two random variables $X$ and $Y$ are said to be bivariate normal, or jointly normal, if $aX+bY$ has a normal distribution for all $a,b \in \mathbb{R}$.

In the above definition, if we let $a=b=0$, then $aX+bY=0$. We agree that the constant zero is a normal random variable with mean and variance $0$. From the above definition, we can immediately conclude the following facts:

  • If $X$ and $Y$ are bivariate normal, then by letting $a=1$, $b=0$, we conclude $X$ must be normal.

  • If $X$ and $Y$ are bivariate normal, then by letting $a=0$, $b=1$, we conclude $Y$ must be normal.

  • If $X \sim N(\mu_X,\sigma^2_X)$ and $Y \sim N(\mu_Y,\sigma^2_Y)$ are independent, then they are jointly normal (Theorem 5.2).

  • If $X \sim N(\mu_X,\sigma^2_X)$ and $Y \sim N(\mu_Y,\sigma^2_Y)$ are jointly normal, then $X+Y \hspace{5pt} \sim \hspace{5pt} N(\mu_X+\mu_Y,\sigma^2_X+\sigma^2_Y+2 \rho(X,Y) \sigma_X \sigma_Y)$ (Equation 5.21).

But how can we obtain the joint normal PDF in general? Can we provide a simple way to generate jointly normal random variables? The basic idea is that we can start from several independent random variables and by considering their linear combinations, we can obtain bivariate normal random variables. Similar to our discussion on normal random variables, we start by introducing the standard bivariate normal distribution and then obtain the general case from the standard one. The following example gives the idea.

Example
Let $Z_1$ and $Z_2$ be two independent $N(0,1)$ random variables. Define \begin{align}%\label{} \nonumber X&=Z_1, \\ \nonumber Y&=\rho Z_1 +\sqrt{1-\rho^2} Z_2, \end{align} where $\rho$ is a real number in $(-1,1)$.
  1. Show that $X$ and $Y$ are bivariate normal.
  2. Find the joint PDF of $X$ and $Y$.
  3. Find $\rho(X,Y)$.

  • Solution
    • First, note that since $Z_1$ and $Z_2$ are normal and independent, they are jointly normal, with the joint PDF \begin{align}%\label{} f_{Z_1Z_2}(z_1,z_2)&=f_{Z_1}(z_1)f_{Z_2}(z_2)\\ &=\frac{1}{2 \pi} \exp \bigg\{-\frac{1}{2} \big[ z_1^2+z^2_2\big] \bigg\}. \end{align}
      1. We need to show $aX+bY$ is normal for all $a,b \in \mathbb{R}$. We have \begin{align} \nonumber aX+bY&=aZ_1+b(\rho Z_1 +\sqrt{1-\rho^2} Z_2)\\ \nonumber &=(a+b \rho)Z_1+b\sqrt{1-\rho^2} Z_2, \end{align} which is a linear combination of $Z_1$ and $Z_2$ and thus it is normal.
      2. We can use the method of transformations (Theorem 5.1) to find the joint PDF of $X$ and $Y$. The inverse transformation is given by \begin{align}%\label{} \nonumber Z_1&=X=h_1(X,Y), \\ \nonumber Z_2&=-\frac{\rho}{\sqrt{1-\rho^2}} X+\frac{1}{\sqrt{1-\rho^2}}Y=h_2(X,Y). \end{align} We have \begin{align} \nonumber f_{XY}(x,y)&=f_{Z_1Z_2}(h_1(x,y),h_2(x,y)) |J|\\ \nonumber &=f_{Z_1Z_2}(x,-\frac{\rho}{\sqrt{1-\rho^2}} x+\frac{1}{\sqrt{1-\rho^2}}y) |J|, \end{align} where \begin{align} \nonumber J= \det \begin{bmatrix} \frac{\partial h_1}{\partial x} & \frac{\partial h_1}{\partial y} \\ & \\ \frac{\partial h_2}{\partial x} & \frac{\partial h_2}{\partial y} \\ \end{bmatrix} =\det \begin{bmatrix} 1 & 0 \\ & \\ -\frac{\rho}{\sqrt{1-\rho^2}} & \frac{1}{\sqrt{1-\rho^2}} \\ \end{bmatrix} =\frac{1}{\sqrt{1-\rho^2}}. \end{align} Thus, we conclude that \begin{align} \nonumber f_{XY}(x,y)&=f_{Z_1Z_2}(x,-\frac{\rho}{\sqrt{1-\rho^2}} x+\frac{1}{\sqrt{1-\rho^2}}y) |J|\\ \nonumber &=\frac{1}{2 \pi} \exp \left\{-\frac{1}{2} \bigg[ x^2+\frac{1}{1-\rho^2}(-\rho x+y)^2 \bigg] \right\} \cdot \frac{1}{\sqrt{1-\rho^2}}\\ \nonumber &=\frac{1}{2 \pi \sqrt{1-\rho^2}} \exp \bigg\{-\frac{1}{2 (1-\rho^2)} \big[ x^2-2\rho x y+y^2 \big] \bigg\}. \end{align}
      3. To find $\rho(X,Y)$, first note \begin{align} &Var(X)=Var(Z_1)=1,\\ &Var(Y)=\rho^2 Var(Z_1)+(1-\rho^2) Var(Z_2)=1. \end{align} Therefore, \begin{align} \nonumber \rho(X,Y) &=\textrm{Cov}(X,Y)\\ \nonumber &=\textrm{Cov}(Z_1,\rho Z_1 +\sqrt{1-\rho^2} Z_2)\\ \nonumber &=\rho \textrm{Cov}(Z_1,Z_1)+\sqrt{1-\rho^2} \textrm{Cov}(Z_1,Z_2)\\ \nonumber &=\rho \cdot 1+ \sqrt{1-\rho^2} \cdot 0\\ \nonumber &=\rho. \end{align}


We call the above joint distribution for $X$ and $Y$ the standard bivariate normal distribution with correlation coefficient $\mathbf{\rho}$. It is the distribution for two jointly normal random variables when their variances are equal to one and their correlation coefficient is $\rho$.

Two random variables $X$ and $Y$ are said to have the standard bivariate normal distribution with correlation coefficient $\mathbf{\rho}$ if their joint PDF is given by \begin{align} \nonumber f_{XY}(x,y)=\frac{1}{2 \pi \sqrt{1-\rho^2}} \exp \{-\frac{1}{2 (1-\rho^2)} \big[ x^2-2\rho x y+y^2 \big] \}, \end{align} where $\rho \in (-1,1)$. If $\rho=0$, then we just say $X$ and $Y$ have the standard bivariate normal distribution.
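As a sanity check on this formula, we can compare it with SciPy's multivariate normal density (a sketch assuming SciPy is available; `std_bvn_pdf` is our own helper name):

```python
import numpy as np
from scipy.stats import multivariate_normal

def std_bvn_pdf(x, y, rho):
    """Standard bivariate normal PDF with correlation rho, per the formula above."""
    return (1 / (2 * np.pi * np.sqrt(1 - rho**2))
            * np.exp(-(x**2 - 2 * rho * x * y + y**2) / (2 * (1 - rho**2))))

rho = 0.7
mvn = multivariate_normal(mean=[0, 0], cov=[[1, rho], [rho, 1]])
for x, y in [(0.0, 0.0), (1.0, -0.5), (2.0, 1.5)]:
    print(std_bvn_pdf(x, y, rho), mvn.pdf([x, y]))  # the two values agree
```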

Now, if you want two jointly normal random variables $X$ and $Y$ such that $X \sim N(\mu_X,\sigma^2_X)$, $Y \sim N(\mu_Y,\sigma^2_Y)$, and $\rho(X,Y)=\rho$, you can start with two independent $N(0,1)$ random variables, $Z_1$ and $Z_2$, and define \begin{align} \label{eq:bivariate-conv} \left\{ \begin{array}{l l} X&=\sigma_X Z_1+\mu_X \\ Y&=\sigma_Y (\rho Z_1 +\sqrt{1-\rho^2} Z_2)+\mu_Y \hspace{20pt} (5.23) \end{array} \right. \end{align} We can find the joint PDF of $X$ and $Y$ as above. While the joint PDF has a big formula, we usually do not need to use the formula itself. Instead, we usually work with properties of jointly normal random variables such as their mean, variance, and covariance.
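Equation 5.23 translates directly into a sampler. The sketch below (Python with NumPy, assumed available; `bivariate_normal` is our own helper name) generates jointly normal pairs with prescribed means, variances, and correlation, and checks the sample moments:

```python
import numpy as np

def bivariate_normal(mu_x, sig_x, mu_y, sig_y, rho, n, rng):
    """Draw n jointly normal pairs via Equation 5.23 (sig_x, sig_y are std devs)."""
    z1 = rng.standard_normal(n)
    z2 = rng.standard_normal(n)
    x = sig_x * z1 + mu_x
    y = sig_y * (rho * z1 + np.sqrt(1 - rho**2) * z2) + mu_y
    return x, y

rng = np.random.default_rng(2)
x, y = bivariate_normal(mu_x=1.0, sig_x=1.0, mu_y=0.0, sig_y=2.0, rho=0.5,
                        n=1_000_000, rng=rng)
print(x.mean(), y.mean())       # ~ 1.0 and 0.0
print(x.var(), y.var())         # ~ 1.0 and 4.0
print(np.corrcoef(x, y)[0, 1])  # ~ 0.5
```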

Definition 5.3 (the linear-combination definition above) and Definition 5.4 (the joint PDF definition below) are equivalent in the sense that, if $X$ and $Y$ are jointly normal based on one definition, they are jointly normal based on the other definition, too. The proof of their equivalence can be concluded from Problem 10 in Section 6.1.6. In that problem, we show that the two definitions result in the same moment generating functions.

Definition
Two random variables $X$ and $Y$ are said to have a bivariate normal distribution with parameters $\mu_X$, $\sigma^2_X$, $\mu_Y$, $\sigma^2_Y$, and $\rho$, if their joint PDF is given by \begin{align} \label{eq:bivariate-normal} \nonumber &f_{XY}(x,y)=\frac{1}{2 \pi \sigma_X \sigma_Y \sqrt{1-\rho^2}} \cdot\\ &\exp \left\{-\frac{1}{2 (1-\rho^2)}\bigg[\bigg(\frac{x-\mu_X}{\sigma_X}\bigg)^2 +\bigg(\frac{y-\mu_Y}{\sigma_Y}\bigg)^2-2\rho \frac{(x-\mu_X)(y-\mu_Y)}{\sigma_X \sigma_Y} \bigg] \right\} (5.24) \end{align} where $\mu_X,\mu_Y \in \mathbb{R}$, $\sigma_X,\sigma_Y>0$ and $\rho \in (-1,1)$ are all constants.

In the above discussion, we introduced bivariate normal distributions by starting from independent normal random variables, $Z_1$ and $Z_2$. Another approach would have been to define the bivariate normal distribution using the joint PDF. The two definitions are equivalent mathematically. In particular, we can state the following theorem.

Theorem


Let $X$ and $Y$ be two bivariate normal random variables, i.e., their joint PDF is given by Equation 5.24. Then there exist independent standard normal random variables $Z_1$ and $Z_2$ such that \begin{align} \nonumber \left\{ \begin{array}{l l} X&=\sigma_X Z_1+\mu_X \\ Y&=\sigma_Y (\rho Z_1 +\sqrt{1-\rho^2} Z_2)+\mu_Y \end{array} \right. \end{align}
Proof. (Sketch) To prove the theorem, define \begin{align} \nonumber \left\{ \begin{array}{l l} Z_1&=\frac{X-\mu_X}{\sigma_X} \\ Z_2&=-\frac{\rho}{\sqrt{1-\rho^2}} \frac{X-\mu_X}{\sigma_X}+\frac{1}{\sqrt{1-\rho^2}}\frac{Y-\mu_Y}{\sigma_Y} \end{array} \right. \end{align} Now find the joint PDF of $Z_1$ and $Z_2$ using the method of transformations (Theorem 5.1), similar to what we did above. You will find that $Z_1$ and $Z_2$ are independent and standard normal and by definition satisfy the equations of Theorem 5.3.

The reason we started our discussion on bivariate normal random variables from $Z_1$ and $Z_2$ is threefold. First, it is more convenient and insightful than the joint PDF formula. Second, sometimes the construction using $Z_1$ and $Z_2$ can be used to solve problems regarding bivariate normal distributions. Third, this method gives us a way to generate samples from the bivariate normal distribution using a computer program. Since most computing packages have a built-in command for generating independent normal random variables, we can simply use this command to generate bivariate normal variables using Equation 5.23.

Example
Let $X$ and $Y$ be jointly normal random variables with parameters $\mu_X$, $\sigma^2_X$, $\mu_Y$, $\sigma^2_Y$, and $\rho$. Find the conditional distribution of $Y$ given $X=x$.

  • Solution
    • One way to solve this problem is by using the joint PDF formula (Equation 5.24). In particular, since $X \sim N(\mu_X,\sigma^2_X)$, we can use \begin{align}%\label{} \nonumber f_{Y|X}(y|x)=\frac{f_{XY}(x,y)}{f_X(x)}. \end{align} Another way to solve this problem is to use Theorem 5.3. We can write \begin{align} \nonumber \left\{ \begin{array}{l l} X&=\sigma_X Z_1+\mu_X \\ Y&=\sigma_Y (\rho Z_1 +\sqrt{1-\rho^2} Z_2)+\mu_Y \end{array} \right. \end{align} Thus, given $X=x$, we have \begin{align} \nonumber Z_1=\frac{x-\mu_X}{\sigma_X}, \end{align} and \begin{align} \nonumber Y&=\sigma_Y \rho \frac{x-\mu_X}{\sigma_X} +\sigma_Y \sqrt{1-\rho^2} Z_2+\mu_Y. \end{align} Since $Z_1$ and $Z_2$ are independent, knowing $Z_1$ does not provide any information on $Z_2$. We have shown that given $X=x$, $Y$ is a linear function of $Z_2$, thus it is normal. In particular \begin{align}%\label{} \nonumber E[Y|X=x]&= \sigma_Y \rho \frac{x-\mu_X}{\sigma_X} +\sigma_Y \sqrt{1-\rho^2} E[Z_2]+\mu_Y\\ \nonumber &=\mu_Y+ \rho \sigma_Y \frac{x-\mu_X}{\sigma_X},\\ \nonumber Var(Y|X=x)&= \sigma^2_Y (1-\rho^2) Var(Z_2)\\ \nonumber &=(1-\rho^2)\sigma^2_Y. \end{align} We conclude that given $X=x$, $Y$ is normally distributed with mean $\mu_Y$+ $\rho \sigma_Y \frac{x-\mu_X}{\sigma_X}$ and variance $(1-\rho^2)\sigma^2_Y$.


Theorem
Suppose $X$ and $Y$ are jointly normal random variables with parameters $\mu_X$, $\sigma^2_X$, $\mu_Y$, $\sigma^2_Y$, and $\rho$. Then, given $X=x$, $Y$ is normally distributed with \begin{align}%\label{} \nonumber &E[Y|X=x]=\mu_Y+ \rho \sigma_Y \frac{x-\mu_X}{\sigma_X},\\ \nonumber &Var(Y|X=x)=(1-\rho^2)\sigma^2_Y. \end{align}
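A quick empirical check of this theorem (a Python sketch with NumPy, assumed available): sample jointly normal pairs via Equation 5.23, keep only the draws where $X$ falls near a chosen value $x$, and compare the conditional sample mean and variance of $Y$ with the formulas above.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2_000_000
mu_x, sig_x, mu_y, sig_y, rho = 1.0, 1.0, 0.0, 2.0, 0.5

# Jointly normal pairs via Equation 5.23.
z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
x = sig_x * z1 + mu_x
y = sig_y * (rho * z1 + np.sqrt(1 - rho**2) * z2) + mu_y

x0 = 2.0
near = np.abs(x - x0) < 0.01  # condition on X being (approximately) x0
print(y[near].mean())  # theory: mu_y + rho*sig_y*(x0 - mu_x)/sig_x = 1.0
print(y[near].var())   # theory: (1 - rho**2) * sig_y**2 = 3.0
```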


Example
Let $X$ and $Y$ be jointly normal random variables with parameters $\mu_X=1$, $\sigma^2_X=1$, $\mu_Y=0$, $\sigma^2_Y=4$, and $\rho=\frac{1}{2}$.
  1. Find $P(2X+Y \leq 3)$.
  2. Find $\textrm{Cov}(X+Y,2X-Y)$.
  3. Find $P(Y>1|X=2)$.

  • Solution
      1. Since $X$ and $Y$ are jointly normal, the random variable $V=2X+Y$ is normal. We have \begin{align}%\label{} \nonumber &EV=2EX+EY=2, \end{align} \begin{align} \nonumber Var(V)&=4Var(X)+Var(Y)+4 \textrm{Cov}(X,Y) \\ \nonumber &=4+4+4 \sigma_X \sigma_Y \rho(X,Y)\\ \nonumber &=8+4 \times 1\times2\times\frac{1}{2}\\ \nonumber &=12. \end{align} Thus, $V \sim N(2,12)$. Therefore, \begin{align}%\label{} \nonumber P(V \leq 3)=\Phi\left(\frac{3-2}{\sqrt{12}}\right)=\Phi\left(\frac{1}{\sqrt{12}}\right)=0.6136 \end{align}
      2. Note that $\textrm{Cov}(X,Y)=\sigma_X \sigma_Y \rho(X,Y)=1$. We have \begin{align} \nonumber \textrm{Cov}(X+Y,2X-Y)&=2\textrm{Cov}(X,X)-\textrm{Cov}(X,Y)+2\textrm{Cov}(Y,X)-\textrm{Cov}(Y,Y) \\ \nonumber &=2-1+2-4=-1. \end{align}
      3. Using Theorem 5.4, we conclude that given $X=2$, $Y$ is normally distributed with \begin{align}%\label{} \nonumber &E[Y|X=2]=\mu_Y+ \rho \sigma_Y \frac{2-\mu_X}{\sigma_X}=1,\\ \nonumber &Var(Y|X=2)=(1-\rho^2)\sigma^2_Y=3. \end{align} Thus, \begin{align}%\label{} \nonumber &P(Y>1|X=2)=1-\Phi\left(\frac{1-1}{\sqrt{3}}\right)=\frac{1}{2}. \end{align}
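All three answers can be checked directly with SciPy (assumed available):

```python
import numpy as np
from scipy.stats import norm

# Part 1: V = 2X + Y ~ N(2, 12).
print(norm.cdf(3, loc=2, scale=np.sqrt(12)))     # ~ 0.6136

# Part 2: Cov(X+Y, 2X-Y) = 2 Var(X) + Cov(X,Y) - Var(Y), with Cov(X,Y) = 1.
print(2 * 1 + 1 - 4)                             # -1

# Part 3: given X = 2, Y ~ N(1, 3).
print(1 - norm.cdf(1, loc=1, scale=np.sqrt(3)))  # 0.5
```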


Remember that if two random variables $X$ and $Y$ are independent, then they are uncorrelated, i.e., $\textrm{Cov}(X,Y)=0$. However, the converse is not true in general. In the case of jointly normal random variables, the converse is true. Thus, for jointly normal random variables, being independent and being uncorrelated are equivalent.

Theorem
If $X$ and $Y$ are bivariate normal and uncorrelated, then they are independent.


Proof. Since $X$ and $Y$ are uncorrelated, we have $\rho(X,Y)=0$. By Theorem 5.4, given $X=x$, $Y$ is normally distributed with \begin{align}%\label{} \nonumber &E[Y|X=x]=\mu_Y+ \rho \sigma_Y \frac{x-\mu_X}{\sigma_X}=\mu_Y,\\ \nonumber &Var(Y|X=x)=(1-\rho^2)\sigma^2_Y=\sigma^2_Y. \end{align} Thus, $f_{Y|X}(y|x)=f_Y(y)$ for all $x,y \in \mathbb{R}$, so $X$ and $Y$ are independent. Another way to prove the theorem is to let $\rho=0$ in Equation 5.24 and observe that $f_{XY}(x,y)=f_X(x)f_Y(y)$.
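This stands in contrast to the counterexample at the start of this section, where the variables were uncorrelated but not jointly normal. The sketch below (Python with NumPy, assumed available) compares a joint tail probability with the product of the marginal tail probabilities for both pairs; the probability factors for the jointly normal pair with $\rho = 0$, but not for the dependent zero-covariance pair:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4_000_000

# Jointly normal with rho = 0: X = Z1, Y = Z2 (Equation 5.23 with rho = 0).
z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
print(np.mean((z1 > 1) & (z2 > 1)), np.mean(z1 > 1) * np.mean(z2 > 1))  # equal

# The zero-covariance pair from the first counterexample is not jointly normal:
x = rng.standard_normal(n)
eps = rng.choice([-1.0, 1.0], size=n)
y = np.where(np.abs(x) <= 1, eps * x, -eps * x)
print(np.mean((x > 1) & (y > 1)), np.mean(x > 1) * np.mean(y > 1))      # differ
```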