# 11.4. Bounds on Correlation

For a random pair \((X, Y)\), the correlation \(r(X, Y)\) is defined as

\[
r(X, Y) = E(X^*Y^*)
\]

where \(X^*\) and \(Y^*\) are respectively \(X\) and \(Y\) measured in standard units.

Thus by definition, correlation is a pure number and has no units.
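As a rough illustration of the definition, the sketch below computes an empirical correlation as the mean of the product of two lists in standard units, then recomputes it after a change of units. The data are simulated, and the names `standard_units`, `correlation`, `heights_cm`, and `weights_kg` are made up for this example; they are not part of the text.

```python
import random
import statistics


def standard_units(values):
    """Convert a list of numbers to standard units (mean 0, SD 1)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population SD of the empirical distribution
    return [(v - mu) / sigma for v in values]


def correlation(xs, ys):
    """Mean of the product of xs and ys, each measured in standard units."""
    xs_su = standard_units(xs)
    ys_su = standard_units(ys)
    return statistics.fmean(a * b for a, b in zip(xs_su, ys_su))


random.seed(0)  # arbitrary seed, used only for reproducibility

# Simulated (hypothetical) data: heights in centimeters, weights in kilograms.
heights_cm = [random.gauss(170, 10) for _ in range(1000)]
weights_kg = [0.5 * h + random.gauss(0, 8) for h in heights_cm]
r_metric = correlation(heights_cm, weights_kg)

# Change units: centimeters to inches, kilograms to pounds.
heights_in = [h / 2.54 for h in heights_cm]
weights_lb = [w * 2.20462 for w in weights_kg]
r_imperial = correlation(heights_in, weights_lb)

# The two values agree up to floating-point rounding.
print(r_metric, r_imperial)
```

Converting to standard units divides out the units of measurement, which is why the two computed values agree.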

You have seen several properties of correlation in Data 8. Some are obvious, such as \(r(X, Y) = r(Y, X)\). Some require proof.

In this brief section we will prove one principal property, which is that correlation is a number between \(-1\) and \(1\). You will prove a few other properties in exercises. In the next section we will specify the sense in which correlation measures clustering about a straight line.

## 11.4.1. Lower Bound

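As a preliminary, recall that a random variable in standard units has expectation \(0\) and variance \(1\), and therefore

\[
E(X^*) = 0 \quad \text{and} \quad E\left({X^*}^2\right) = 1
\]

So also \(E(Y^*) = 0\) and \(E\left({Y^*}^2\right) = 1\).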

We know the expected squares, and what we need is a bound on the expected product \(E(X^*Y^*)\). A result that connects the squares and the product of two numbers is \((a+b)^2 = a^2 + 2ab + b^2\).

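So let’s find \(E\left((X^* + Y^*)^2\right)\) and see what that gives us.

\[
E\big{(}(X^* + Y^*)^2\big{)} = E\left({X^*}^2\right) + 2E(X^*Y^*) + E\left({Y^*}^2\right) = 2 + 2r(X, Y)
\]

Since \(E\big{(}(X^* + Y^*)^2\big{)} \ge 0\), we have

\[
2 + 2r(X, Y) \ge 0
\]

which is the same as

\[
r(X, Y) \ge -1
\]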

## 11.4.2. Upper Bound

Play the same game with \(E\big{(}(X^* - Y^*)^2\big{)}\) to see that

\[
2 - 2r(X, Y) \ge 0, \quad \text{that is,} \quad -2r(X, Y) \ge -2
\]

which is the same as

\[
r(X, Y) \le 1
\]

because division by \(-2\) flips the direction of the inequality.
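The two expected squares used in the bounds work out to \(2 + 2r(X, Y)\) and \(2 - 2r(X, Y)\). As a numerical sanity check, and not a substitute for the proof, the sketch below verifies the empirical versions of these identities on simulated data. The data-generating choices and the helper `standard_units` are assumptions made only for this example.

```python
import random
import statistics


def standard_units(values):
    """Convert a list of numbers to standard units (mean 0, SD 1)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population SD of the empirical distribution
    return [(v - mu) / sigma for v in values]


random.seed(1)  # arbitrary seed, used only for reproducibility

# Simulated (hypothetical) pair with a nonlinear relation plus noise.
xs = [random.uniform(0, 1) for _ in range(500)]
ys = [x ** 2 + random.gauss(0, 0.2) for x in xs]

x_su = standard_units(xs)
y_su = standard_units(ys)

# Empirical correlation: mean of the product of standard units.
r = statistics.fmean(a * b for a, b in zip(x_su, y_su))

# Empirical versions of E((X* + Y*)^2) and E((X* - Y*)^2).
mean_sq_sum = statistics.fmean((a + b) ** 2 for a, b in zip(x_su, y_su))
mean_sq_diff = statistics.fmean((a - b) ** 2 for a, b in zip(x_su, y_su))

# Each pair of printed values agrees up to floating-point rounding.
print(mean_sq_sum, 2 + 2 * r)
print(mean_sq_diff, 2 - 2 * r)
```

Both means of squares are non-negative, so the printed values can only be consistent with \(r\) lying between \(-1\) and \(1\).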

## 11.4.3. Other Properties

As you know from Data 8, correlation measures linear association. In exercises you will show that if \(Y\) is a linear function of \(X\) then \(r(X, Y)\) is either \(1\) or \(-1\).
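Before attempting the proof, it may help to see the claim numerically. The sketch below builds two exact linear functions of a simulated \(X\), one with positive slope and one with negative slope, and computes the empirical correlation in each case. The slopes, intercepts, distribution of `xs`, and helper names are arbitrary choices for this illustration.

```python
import random
import statistics


def standard_units(values):
    """Convert a list of numbers to standard units (mean 0, SD 1)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population SD of the empirical distribution
    return [(v - mu) / sigma for v in values]


def correlation(xs, ys):
    """Mean of the product of xs and ys, each measured in standard units."""
    xs_su = standard_units(xs)
    ys_su = standard_units(ys)
    return statistics.fmean(a * b for a, b in zip(xs_su, ys_su))


random.seed(2)  # arbitrary seed, used only for reproducibility
xs = [random.expovariate(1.0) for _ in range(200)]

# Two exact linear functions of xs: positive slope, then negative slope.
y_up = [3 * x + 5 for x in xs]
y_down = [-0.5 * x + 2 for x in xs]

r_up = correlation(xs, y_up)
r_down = correlation(xs, y_down)

# r_up is 1 and r_down is -1, up to floating-point rounding.
print(r_up, r_down)
```

The intercept and the magnitude of the slope do not affect the result in this experiment; only the sign of the slope does.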

You will also find the relation between \(r(X, Y)\) and \(r(X, W)\) where \(W\) is a linear function of \(Y\).

In the next section we will return to regression and formalize the idea that correlation measures clustering about a straight line. Our result will imply that if \(r(X, Y)\) is either \(1\) or \(-1\), then the relation between \(X\) and \(Y\) must be perfectly linear.